Abstract
Combinational therapy, which combines two or more therapeutic agents, is very common in cancer treatment. Currently, many clinical trials aim to assess the feasibility, safety, and activity of combinational therapeutics to achieve a synergistic response. Dose-finding for combinational agents is considerably more complex than for a single agent because only the partial order of the dose combinations' toxicities is known. Conventional phase I designs may not adequately capture this complexity, thus limiting identification of the maximum tolerated dose (MTD) of combinational agents. In response, many novel phase I clinical trial designs for combinational agents have been proposed. However, studies that compare the performance of these designs, explore the impact of design parameters, and provide recommendations are limited. We evaluate available phase I designs that identify a single MTD for combinational agents using simulation studies under various conditions. We also explore the influence of different design parameters and summarize the risks and benefits of each design to provide general guidance for design selection.
Keywords: Phase I clinical trial, combinational agents, dose-finding, guidance for phase I
1. Introduction
Clinical trials investigating combinational therapies that combine two or more therapeutic agents have garnered renewed attention with the development of cancer therapy. Novel combinational agents require identification of a maximum tolerated dose (MTD), the highest dose level that leads to a pre-specified target toxicity probability. However, dose-finding for combinational agents is challenging because the complete ordering of the dose combinations' toxicities is unknown. To fill this gap, several phase I study designs for combinational agents have been proposed.
Overall, there are three categories of designs: algorithm- or rule-based, model-based, and model-assisted. Algorithm-based designs do not assume any parametric relationship between dose combinations and their toxicity probabilities. Ivanova and Wang [13] proposed an up-and-down design and used isotonic regression to estimate the MTD. Later, Ivanova and Kim [12] updated the previous up-and-down design using a T-statistic. Lee and Fan [16] proposed a two-dimensional search algorithm to identify the MTD. Algorithm-based designs usually lack a statistical foundation, and their escalation/de-escalation rules are ad hoc. Therefore, their performance is not guaranteed.
Model-based designs assume a parametric dose-toxicity relationship. To account for the uncertainty in estimating the design parameters at the beginning of the trial, a start-up phase is usually used before transitioning to the model-based part. The main differences among model-based methods are the choice of dose-toxicity relationship and the scheme of the start-up phase. Thall et al. [30] proposed to identify the MTD contour with a six-parameter logistic regression. Wang and Ivanova [36] proposed a three-parameter model to link doses and toxicity probabilities. Yin and Yuan proposed a latent contingency table method [38] and another method that uses a copula to model toxicity probabilities through the marginal toxicity profiles of the individual agents [39]. Riviere et al. [28] developed a method based on Bayesian logistic regression. Braun and Jia [2] fitted a proportional odds logistic regression model within each 'row' of the dose combination matrix and later joined the models together. Braun and Wang [3] proposed a hierarchical model that links the effective doses to the hyperparameters of the dose toxicity probabilities. Tighiouart et al. [31] extended the escalation with overdose control (EWOC) method [1] to the two-dimensional setting to identify the MTD curve. Some other designs were motivated by the main difficulty of two-dimensional dose-finding, namely that only the partial order of the dose combinations' toxicities is known. In this case, we know that doses (j + 1, k) and (j, k + 1) are more toxic than dose (j, k), but we do not know the toxicity order of doses (j + 1, k) and (j, k + 1), where (j, k) denotes the combination of the jth dose of one agent and the kth dose of the second agent. Therefore, several possible toxicity orderings that satisfy the partial order exist. Conaway et al. [5] proposed to identify all possible orderings of dose combination toxicities so that two-dimensional dose-finding can be solved by the continual reassessment method (CRM) [23]. Building upon this methodology, Wages et al. [33,34] proposed to use a subset of possible orderings, which is more feasible especially when the total number of possible orderings is large. To avoid pre-specifying orderings, Lin and Yin [17] proposed to update the ordering dynamically. However, model-based designs have several limitations in practice: (1) they are relatively complicated and require constant model updating by statisticians, which poses barriers for clinicians to understand and implement them; (2) most designs need parameter calibration; (3) some designs require prior knowledge about the agents (e.g. guesses of the dose combinations' toxicities), which is not easy for clinicians to provide.
Model-assisted designs still utilize statistical models in decision making but focus on easier implementation by pre-tabulating escalation and de-escalation rules before trial conduct [37]. Therefore, they combine the advantages of algorithm-based and model-based designs. The BOIN design [20,40] and the Keyboard design [37] are representative model-assisted designs. Later, Lin and Yin [18] extended the BOIN design and Pan et al. [24] extended the Keyboard design to handle two-dimensional dose-finding.
In addition, there are designs that incorporate special features while conducting two-dimensional dose-finding. Liu and Ning [19] proposed a design that can handle trials with delayed toxicities. Diniz et al. [6] built a Bayesian design for combinational doses upon the escalation with overdose control (EWOC) method [1], accounting for patient heterogeneity by incorporating baseline covariates.
With so many novel methods available yet very limited implementation in oncology trials, we have relatively little knowledge about which methods are superior under which scenarios. Riviere et al. [26] compared six phase I designs for combinational agents that are either algorithm-based or model-based. Their paper claimed that all designs were optimized to improve the percentage of correct MTD selection before comparison. However, such a claim is itself questionable, as it is impossible to achieve optimization under diverse scenarios using a universal set of parameters. Based on the sensitivity analyses the authors conducted, different scenarios actually required different parameter sets to achieve optimization. Therefore, simply comparing results using a single set of design parameters makes the comparison less meaningful. Another limitation of this study is that it did not discuss in detail the influence of different design settings, although sensitivity analyses were presented. Hirakawa et al. [10] compared the performance of five model-based designs for combinational agents, but did not explore the effects of design parameters beyond the cohort size. Harrington et al. [8] reviewed some algorithm-based and model-based combinational-agent designs and discussed their advantages and limitations without simulation studies. None of the above papers included the recently developed model-assisted designs, as they were published earlier. Moreover, the feasibility of parameter tuning and the influence of design parameters for the model-based designs were investigated only in a limited fashion. Later, Pan et al. [24] compared the two-dimensional BOIN design, the two-dimensional Keyboard design, and the continual reassessment method for partial ordering (POCRM) [33,34]. However, POCRM was the only model-based design in that comparison, and, similar to the review studies mentioned above, the effects of design parameters for POCRM were not explored. To provide a more recent view of phase I clinical trial designs for combinational agents, we conducted a simulation study to evaluate the performance of various designs under comprehensive scenarios.
Our study differs from previous review papers in several aspects: (1) we included two recently developed model-assisted designs; (2) for design parameters with no clear recommendations, we examined multiple sets to investigate their influence instead of using a single, subjectively selected set; (3) we used different sample sizes in the simulations to check whether our findings remain valid across trial sizes; and (4) in addition to summarizing each design's characteristics, we discuss putative reasons for their performance.
Specifically, in this paper we focus on designs that (1) utilize toxicity information only, (2) identify a single MTD instead of an MTD contour or curve, (3) assume a monotonic dose-toxicity relationship within each individual agent, and (4) have program code or software available. As a result, we selected nine designs: dose finding in discrete dose space [36], Bayesian dose finding by copula regression [39], the continual reassessment method for partial ordering [33,34], the hierarchical Bayesian design [3], the logistic model-based Bayesian dose-finding design [28], the generalized continual reassessment method [2], the bootstrap aggregating continual reassessment method [17], the combinational Bayesian optimal interval design [18], and the combinational Keyboard design [24]. We did not include the few currently available algorithm-based designs, as there is consensus that algorithm-based designs usually have inferior performance compared with model-based ones [14,22,26].
The rest of the paper is organized as follows: we first review the nine designs included in our evaluation, then present our simulation studies and results, and finally discuss our findings.
2. Review of designs
2.1. Notations
Here we define some common notations used in these methods. As most of the designs we selected apply to dual-agent dose-finding only (Copula has been extended to handle more than two agents), we assume two agents A and B, with J and K doses, respectively. Define π_jk to be the true toxicity probability of the dose combination (j, k), j = 1, …, J, k = 1, …, K; define p_j to be the true toxicity probability of dose j of agent A when used as a monotherapy, j = 1, …, J, and q_k to be the true toxicity probability of dose k of agent B when used as a monotherapy, k = 1, …, K. Define ϕ to be the pre-specified target toxicity probability. Define N to be the maximum sample size of the trial. Define n_jk to be the number of subjects that received dose (j, k) and y_jk to be the number of dose-limiting toxicities (DLTs) observed among those patients.
2.2. Model-based: dose finding in discrete dose space (I2D) [36]
This method is a Bayesian design that extends CRM to accommodate dose-finding in dual agents.
Dose-toxicity model:
| (1) |
where the unknown model parameters are constrained to satisfy the assumption of toxicity monotonicity, and the dose values entering the model are standardized constants rather than the actual doses of the agents. If no interaction between the two agents exists in Equation (1), the model reduces to
| (2) |
Start-up phase: the trial is initiated at the lowest dose combination. Agent A is escalated while agent B is maintained at its lowest dose as long as no DLT is observed. If no DLT has been observed when agent A reaches its maximum dose, agent B is escalated to its second-lowest dose, combined with a lower dose of agent A. If no DLT is observed, we continue to a combination where agent B is at its third-lowest dose and agent A is at the dose used in the previous combination minus two levels. When agent B reaches its maximum dose, the combinations not yet explored at that dose of agent B are evaluated. The start-up phase ends as soon as at least one DLT is observed.
Post start-up escalation/de-escalation dose set: if the current combination is (j, k), I2D only considers the current combination and its adjacent doses (j ± 1, k) and (j, k ± 1) for the next subject, prohibiting diagonal moves.
Post start-up trial conduct: after the start-up phase ends, the working model in Equation (1) or Equation (2) is used to obtain toxicity estimates for all dose combinations. Due to safety concerns, the model-based part starts at the combination where agent B is at its lowest dose and agent A is at the dose whose combination has estimated toxicity probability closest to ϕ.
MTD determination: the dose combination whose posterior probability of toxicity is closest to ϕ will be selected as the MTD.
2.3. Model-based: Bayesian dose finding by copula regression Copula [39]
This method utilizes a copula to model the dose-toxicity relationship, because a copula links the joint distribution to the marginal distributions via a dependence parameter.
Dose-toxicity model:
| π_jk = 1 − {(1 − p_j^α)^(−γ) + (1 − q_k^β)^(−γ) − 1}^(−1/γ) | (3) |
where α and β are power parameters, as in the CRM, that accommodate the uncertainty in the prior guesses p_j and q_k, and γ > 0 characterizes the interaction between the two agents. Informative prior distributions with prior mean 1 and relatively small variance (e.g. Gamma(2, 2)) are assigned to α and β. A Gamma distribution with a large variance is usually chosen as the non-informative prior for γ. If only one agent is involved, this approach reduces to the regular CRM.
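To make the copula link concrete, the R sketch below evaluates this Clayton-type combination of two marginal toxicity profiles; the marginal guesses p and q, the parameter values, and the function name copula_tox are illustrative assumptions rather than the authors' implementation.

```r
# Clayton-type copula combining marginal toxicity profiles, in the spirit of [39];
# alpha, beta are the power parameters and gamma is the interaction parameter.
copula_tox <- function(p, q, alpha = 1, beta = 1, gamma = 1) {
  # p: prior toxicity guesses for agent A (length J); q: for agent B (length K)
  outer(p, q, function(pj, qk) {
    1 - ((1 - pj^alpha)^(-gamma) + (1 - qk^beta)^(-gamma) - 1)^(-1 / gamma)
  })
}

# Example: 5 doses of agent A and 3 doses of agent B (assumed marginal profiles)
p <- c(0.05, 0.10, 0.15, 0.30, 0.45)
q <- c(0.10, 0.20, 0.30)
round(copula_tox(p, q), 3)   # J x K matrix of joint toxicity probabilities
```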
Start-up phase: the start-up phase begins at the lowest dose combination (1, 1). It proceeds vertically until the first toxicity is observed, and then proceeds horizontally until the first toxicity is observed. Once a toxicity has been observed in both directions, the formal model-based stage starts.
Post start-up escalation/de-escalation dose set: if the current combination is (j, k), the dose escalation set is defined as {(j + 1, k), (j, k + 1), (j + 1, k − 1), (j − 1, k + 1)} and the dose de-escalation set as {(j − 1, k), (j, k − 1), (j + 1, k − 1), (j − 1, k + 1)}. As we only know the partial order of dose toxicity in combinational agents, we do not know whether dose combinations (j + 1, k − 1) and (j − 1, k + 1) are more or less toxic than dose combination (j, k). Therefore, the authors included (j + 1, k − 1) and (j − 1, k + 1) in both the escalation and de-escalation sets.
Post start-up trial conduct: this design involves two fixed probability cut-offs, c_e for dose escalation and c_d for dose de-escalation, whose sum must exceed 1 so that escalation and de-escalation cannot be triggered simultaneously. The detailed algorithm is laid out below.
If, at the current dose combination (j, k), the posterior probability that π_jk is less than ϕ exceeds c_e, we escalate to the dose that belongs to the escalation set and has estimated toxicity probability closest to ϕ and higher than that of (j, k). If the current dose is (J, K), then stay at the same combination.
If, at the current dose combination (j, k), the posterior probability that π_jk is greater than ϕ exceeds c_d, we de-escalate to the dose that belongs to the de-escalation set and has estimated toxicity probability closest to ϕ and lower than that of (j, k). If the current dose is (1, 1), the trial is terminated.
Otherwise, stay at the same dose combination.
MTD determination: after N subjects are exhausted, the MTD is determined as the dose combination with the estimated probability of toxicity closest to ϕ.
2.4. Model-based: continual reassessment method for partial ordering (POCRM) [33,34]
To solve the problem of only knowing partial order of dose toxicity, POCRM proposes to pre-specify a subset of possible orderings, then utilize the CRM on each of them. This way, two-dimensional dose-finding is reduced to a one-dimensional problem.
Dose-toxicity model: define T as the total number of dose combinations, d_1, …, d_T; x_n as the dose combination assigned to subject n; R(x_n) as the true toxicity probability of x_n; y_n as a binary indicator of whether subject n has a toxicity or not, y_n ∈ {0, 1}; and Ω_n = {(x_1, y_1), …, (x_n, y_n)} as the data collected after n subjects. Assume we have M possible partial orderings in total. For a specific ordering m, m = 1, …, M, the toxicity probability of each combination is modeled as below, similarly to the CRM,
| R(d_t) ≈ ψ_m(d_t, a) = α_m(d_t)^{exp(a)}, t = 1, …, T | (4) |
where ψ_m is the working dose-toxicity model under ordering m, α_m(d_t) is the skeleton value assigned to combination d_t under that ordering, and a is the model parameter. After having n patients, the likelihood under partial ordering m is
| L_m(a | Ω_n) = ∏_{i=1}^{n} ψ_m(x_i, a)^{y_i} {1 − ψ_m(x_i, a)}^{1 − y_i} | (5) |
Then, the estimate of parameter a under ordering m, â_m, can be obtained by maximizing Equation (5). We can then obtain the posterior probability of partial ordering m:
| p(m | Ω_n) = p(m) L_m(â_m | Ω_n) / ∑_{m′=1}^{M} p(m′) L_{m′}(â_{m′} | Ω_n) | (6) |
where p(m) is the prior probability of ordering m.
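To illustrate this ordering-weighting step, the R sketch below fits the one-parameter power model under each of two candidate orderings by maximum likelihood and normalizes the prior times the maximized likelihood as in Equation (6); the skeleton, the orderings, the data, and the function name ordering_weights are illustrative assumptions, not the pocrm package implementation.

```r
# POCRM ordering-selection sketch: fit a power-model CRM under each candidate
# ordering and weight the orderings by prior * maximized likelihood.
skeleton  <- c(0.05, 0.10, 0.17, 0.25, 0.30, 0.36, 0.42, 0.47, 0.52)  # 3 x 3 grid, flattened
orderings <- rbind(1:9, c(1, 2, 4, 3, 5, 7, 6, 8, 9))                 # two candidate orderings

ordering_weights <- function(y, x, skeleton, orderings,
                             prior = rep(1 / nrow(orderings), nrow(orderings))) {
  max_lik <- sapply(seq_len(nrow(orderings)), function(m) {
    skel_m <- numeric(length(skeleton))
    skel_m[orderings[m, ]] <- skeleton                  # skeleton arranged under ordering m
    loglik <- function(a) {
      prob <- skel_m[x]^exp(a)                          # power-model toxicity probabilities
      sum(y * log(prob) + (1 - y) * log(1 - prob))
    }
    exp(optimize(loglik, c(-5, 5), maximum = TRUE)$objective)
  })
  w <- prior * max_lik
  w / sum(w)                                            # normalized weight of each ordering
}

# Example: six subjects with flattened combination indices x and DLT indicators y
x <- c(1, 2, 3, 5, 5, 6); y <- c(0, 0, 0, 1, 0, 1)
ordering_weights(y, x, skeleton, orderings)
```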
Start-up phase: when POCRM was first proposed, it was a single stage method [33]. Later it was extended to include a start-up phase [34]. The start-up phase partitions the dose combination matrix to different ‘zones’ and starts the first cohort with zone 1, which is the lowest dose combination. If no DLT is observed, then it assigns next cohort to doses in zone 2. If there are multiple combinations in zone 2, it randomly selects one of them. If no DLT is observed, it continues to assign the next cohort to other combinations in the same zone. Moving to the next zone is only allowed when all the dose combinations have been explored in lower zones. The start-up phase ends when one DLT is observed. POCRM also allows users to specify their own scheme in the start-up phase as they see appropriate.
Post start-up escalation/de-escalation dose set: since POCRM pre-specifies a subset of possible orderings, the escalation and de-escalation dose are known within each ordering.
Post start-up trial conduct: after the start-up phase ends, the authors use weighted randomization to select the working partial ordering, with the posterior probabilities from Equation (6) serving as the weights. After selecting the working partial ordering m, R(d_t) is estimated for all combinations d_t through Equation (4), and the dose combination whose estimated toxicity probability is closest to ϕ is assigned to the next subject. For the final MTD determination, however, the ordering with the maximum posterior probability is chosen among all candidate orderings.
MTD determination: after N subjects are exhausted, the MTD is determined as the dose combination with the estimated probability of toxicity closest to ϕ, given the ordering with maximum posterior probability.
2.5. Model-based: hierarchical Bayesian design (Hierarchy) [3]
Dose-toxicity model: This method employs a hierarchical model:
| (7) |
| (8) |
| (9) |
where follows a multivariate normal distribution with mean and variance covariance matrix where is a identity matrix; follows a multivariate normal distribution with mean and the same variance covariance matrix; and are ‘effective doses’ instead of actual clinical values. This method omits the interaction effects between two agents.
The authors provided recommendations about selecting priors and methods to calculate ‘effective doses’. They used the fact that to obtain the solutions for and :
They suggested setting and selecting . They set , then
where
Therefore, this design needs inputs of and , from the clinicians.
Start-up phase: this method is a single stage design without a start-up phase.
Escalation/de-escalation dose set: if the current combination is (j, k), the acceptable dose combination set S for the next subject is defined as {(j, k), (j ± 1, k), (j, k ± 1), (j + 1, k + 1), (j − 1, k − 1), (j + 1, k − 1), (j − 1, k + 1)}. As the dose combinations (j + 1, k + 1) and (j − 1, k − 1) are included, this design allows simultaneous dose escalation or de-escalation of both agents.
Trial conduct:
Compute a CI for overall toxicity rate from cumulative data of currently recruited subjects.
If the lower bound of this CI is greater than ϕ, terminate the trial.
If the lower bound of this CI is less than or equal to ϕ, use all accumulated information to obtain the posterior means of the toxicity probabilities of all dose combinations.
Select the dose combination that belongs to set S and whose estimated toxicity probability is closest to ϕ. Assign this dose combination to the next patient.
Continue until all N subjects are exhausted.
MTD determination: Use the outcomes and assignments of all N subjects to derive the posterior means of the toxicity probabilities of all dose combinations. If the last subject received dose (j, k), then the dose combination that is among the set S of (j, k) and has estimated toxicity closest to ϕ will be selected as the MTD.
2.6. Model-based: logistic model-based Bayesian dose finding design DFCOMB [28]
Dose-toxicity model: this method uses logistic regression to link doses and toxicities of the two agents:
| (10) |
where and are ‘effective doses’ instead of actual clinical values, , , , to ensure monotonicity. The authors define ‘effective doses’ as , and recommend a vague normal prior for and , an informative prior exp for and .
Start-up phase: the start-up phase starts from the lowest dose combination (1, 1). If no toxicity is observed, the dose is escalated along the diagonal until at least one agent reaches its maximum dose. If still no toxicity is observed when one agent reaches its maximum dose, we increase the dose of the other agent until both agents reach their maximum doses. The start-up phase ends once the first toxicity is observed, and the model-based design starts.
Post start-up escalation/de-escalation dose set: if the current combination is (j, k), the dose escalation set is defined as {(j + 1, k), (j, k + 1), (j + 1, k − 1), (j − 1, k + 1)} and the dose de-escalation set as {(j − 1, k), (j, k − 1), (j + 1, k − 1), (j − 1, k + 1)}. As we only know the partial order of dose toxicity in combinational agents, we do not know whether dose combinations (j + 1, k − 1) and (j − 1, k + 1) are more or less toxic than dose combination (j, k). Therefore, the authors included (j + 1, k − 1) and (j − 1, k + 1) in both the escalation and de-escalation sets.
Post start-up trial conduct: in the model-based design part, the escalation and de-escalation rule is the same as in Copula design [39].
MTD determination: DFCOMB utilizes a different method to identify the MTD after the trial is completed. The dose combination that has the largest posterior probability of its toxicity lying within (ϕ − δ, ϕ + δ) and that has been used to treat at least one cohort will be selected as the MTD, where the parameter δ defines the length of the interval around the target toxicity probability.
2.7. Model-based: a generalized continual reassessment method gCRM [2]
This method is another generalization of the CRM.
Dose-toxicity model: it uses proportional odds logistic regression to model the dose-toxicity relationship:
| (11) |
where each dose level of agent B has its own intercept; β is a coefficient common to all models; and the 'effective dose' of agent A enters as the covariate. For example, if agent B has three dose levels, then gCRM will need three models, one for each level of agent B. These 'sub' models are later aggregated through a joint prior distribution that forces correlation among the intercepts. As observed, gCRM assumes no interaction between the two agents as well.
In terms of parameters and β, their paper assumes that β follows a Gamma distribution with mean and variance , , and defines for so the joint distribution of is multivariate normal. If one assumes that , can be obtained. One can approximately obtain . The authors recommend setting , , . Therefore, with the inputs of and for from clinicians, all parameters can be calculated.
Start-up phase: this method is a single stage design without a start-up phase.
Escalation/de-escalation dose set: if the current combination is (j, k), define the acceptable dose combination set S for the next subject to be {(j, k), (j ± 1, k), (j, k ± 1), (j + 1, k + 1), (j − 1, k − 1), (j + 1, k − 1), (j − 1, k + 1)}. As the dose combinations (j + 1, k + 1) and (j − 1, k − 1) are included, this design allows simultaneous dose escalation or de-escalation of both agents.
Trial conduct:
Treat the first patient at the lowest dose combination (1, 1).
For each subsequent patient, compute the estimated toxicity probability of every dose combination from the fitted proportional odds logit model, plugging in the posterior means of the parameters.
As the posterior distributions are updated continually, check whether the stopping rule has been reached. If yes, then terminate the trial; otherwise assign the next subject to the dose in set S whose estimated toxicity probability is closest to ϕ.
Continue until all N subjects are exhausted.
MTD determination: if the last subject received dose (j, k), then the dose combination that is among the set S of (j, k) and has estimated toxicity closest to ϕ will be selected as the MTD.
2.8. Model-based: bootstrap aggregating continual reassessment method bCRM [17]
Bootstrap aggregating CRM is similar to POCRM as they both use one-dimensional CRM to identify the MTD. However, bCRM keeps updating the toxicity ordering of dose combinations rather than pre-specifying them.
Dose-toxicity model: bCRM assigns a beta prior to the toxicity probability of each dose combination and obtains its posterior mean. bCRM then applies the two-dimensional pool-adjacent-violators algorithm (PAVA) [4] to these posterior means so that the smoothed estimates satisfy the partial ordering. To avoid ties among the smoothed estimates, a small perturbation term is added that is proportional to the rank of each dose combination and scaled by a small positive number ϵ. The resulting estimates induce a complete toxicity ordering of the dose combinations. As noted by the authors, such orderings could vary dramatically due to data sparsity. Therefore, they bootstrapped B samples of the data to obtain the corresponding orderings and toxicity probability estimates. The final estimate of each combination's toxicity probability is
| (12) |
where the cumulative data up to the current subject record, for each subject i, whether subject i experienced a DLT and which dose combination subject i received.
Start-up phase: the start-up phase is similar to DFCOMB.
Post start-up escalation/de-escalation dose set: similar to POCRM, bCRM uses one-dimensional CRM in the dose-finding process, therefore, the escalation and de-escalation dose is certain within each ordering.
Post start-up trial conduct: the trial conduct procedures are similar to DFCOMB.
MTD determination: After the trial is completed, one could select the combination that has been administered to patients and has the largest posterior probability of falling into the ε-neighborhood of ϕ, where ε is a small positive number.
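To illustrate the ε-neighborhood criterion, the R sketch below scores each administered combination by its posterior probability of lying within (ϕ − ε, ϕ + ε). bCRM computes this probability under its bagged model-based estimates; the simple per-combination Beta posterior and the function name select_eps_neighborhood used here are stand-in assumptions.

```r
# Select, among administered combinations, the one with the largest posterior
# probability of its toxicity lying in (phi - eps, phi + eps); a Beta(0.5, 0.5)
# prior per combination is used here purely as an illustration.
select_eps_neighborhood <- function(y, n, phi = 0.3, eps = 0.05) {
  tried <- which(n > 0)                                   # combinations given to patients
  prob_in <- pbeta(phi + eps, 0.5 + y[tried], 0.5 + n[tried] - y[tried]) -
             pbeta(phi - eps, 0.5 + y[tried], 0.5 + n[tried] - y[tried])
  tried[which.max(prob_in)]                               # flattened index of the selection
}

# Example: four combinations, with DLT counts y and patient counts n
y <- c(0, 1, 3, 0); n <- c(3, 6, 9, 0)
select_eps_neighborhood(y, n)   # index of the combination with most posterior mass near 0.3
```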
2.9. Model-assisted: combinational Bayesian optimal interval design cBOIN [18]
Combinational BOIN is a model-assisted design that is generalized from the single agent BOIN design [20,40].
Dose escalation and de-escalation rule: BOIN mainly involves two important parameters, λ_e and λ_d, which are the lower and upper interval boundaries. At the current dose j, the escalation and de-escalation rules are as follows:
if λ_e < p̂_j < λ_d, then the next cohort stays at the current dose;
if p̂_j ≤ λ_e, then the next cohort escalates to dose j + 1;
if p̂_j ≥ λ_d, then the next cohort de-escalates to dose j − 1;
where p̂_j is the estimated toxicity probability of dose j in single-agent dose-finding, which is simply the proportion of patients experiencing toxicities among those who received dose j.
In two-dimensional dose-finding, p̂_jk is calculated in the same way: p̂_jk = y_jk / n_jk.
An important task of cBOIN is to determine λ_e and λ_d. Minimizing the probability of an incorrect dose-assignment decision given the data at the current dose yields
λ_e = log{(1 − ϕ_1)/(1 − ϕ)} / log{ϕ(1 − ϕ_1)/(ϕ_1(1 − ϕ))} and λ_d = log{(1 − ϕ)/(1 − ϕ_2)} / log{ϕ_2(1 − ϕ)/(ϕ(1 − ϕ_2))},
where ϕ_1 is the highest toxicity probability deemed sub-therapeutic and ϕ_2 is the lowest toxicity probability deemed overly toxic. The authors suggested using ϕ_1 = 0.6ϕ and ϕ_2 = 1.4ϕ based on their simulation calibration.
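As a concrete check, the R sketch below computes these boundaries and applies the interval rule; the defaults ϕ_1 = 0.6ϕ and ϕ_2 = 1.4ϕ are used, and the function names are ours. For ϕ = 0.3 this gives λ_e ≈ 0.236 and λ_d ≈ 0.358.

```r
# BOIN interval boundaries and the escalation/de-escalation rule.
boin_boundaries <- function(phi, phi1 = 0.6 * phi, phi2 = 1.4 * phi) {
  lambda_e <- log((1 - phi1) / (1 - phi)) / log(phi * (1 - phi1) / (phi1 * (1 - phi)))
  lambda_d <- log((1 - phi) / (1 - phi2)) / log(phi2 * (1 - phi) / (phi * (1 - phi2)))
  c(lambda_e = lambda_e, lambda_d = lambda_d)
}

boin_decision <- function(n_dlt, n_pat, phi = 0.3) {
  p_hat <- n_dlt / n_pat                 # observed toxicity rate at the current combination
  b <- boin_boundaries(phi)
  if (p_hat <= b["lambda_e"]) "escalate"
  else if (p_hat >= b["lambda_d"]) "de-escalate"
  else "stay"
}

boin_boundaries(0.3)    # approximately 0.236 and 0.358
boin_decision(2, 9)     # 2/9 = 0.22 <= 0.236, so escalate
```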
Start-up phase: this method does not have a start-up phase.
Escalation/de-escalation dose set: the admissible dose escalation set is defined as {(j + 1, k), (j, k + 1)} and the admissible de-escalation set as {(j − 1, k), (j, k − 1)}.
Trial conduct:
Treat the first cohort at the lowest dose combination (1, 1).
Suppose that the current dose is combination (j, k). If p̂_jk ≤ λ_e, escalate to the dose combination that belongs to the admissible escalation set and has the largest posterior probability of its toxicity lying inside the interval (λ_e, λ_d).
If p̂_jk ≥ λ_d, de-escalate to the dose combination that belongs to the admissible de-escalation set and has the largest posterior probability of its toxicity lying inside the interval (λ_e, λ_d).
Otherwise, if λ_e < p̂_jk < λ_d, stay at the current dose.
Dose combinations whose posterior probability of the toxicity exceeding ϕ is greater than λ will be permanently excluded, where λ is a pre-specified threshold probability. If the lowest dose combination (1, 1) satisfies this stopping rule, the trial will be terminated early.
Continue until all N subjects are exhausted.
MTD determination: after the trial is completed, isotonic regression is applied to the observed toxicity rates p̂_jk to obtain estimates that satisfy the monotonic dose-toxicity assumption within each agent when the other agent's dose is fixed. The MTD is the dose combination whose isotonic estimate is closest to ϕ.
2.10. Model-assisted: combinational keyboard design cKeyboard [24]
Similar to combinational BOIN, combinational Keyboard is a model-assisted design as well. The combinational Keyboard design starts by specifying a target toxicity interval (ϕ − ε_1, ϕ + ε_2), where ε_1 and ε_2 are tolerable deviations from ϕ. This interval is called the target key. Then a series of keys of the same width are identified along both sides of the target key.
Dose escalation and de-escalation rule: in the single-agent setting, the escalation and de-escalation rules are straightforward. Define the 'strongest key' to be the key holding the largest posterior probability mass of the toxicity probability at the current dose j (see the sketch after these rules):
if the strongest key is below the target key, then the next cohort escalates to dose j + 1;
if the strongest key is the target key, then the next cohort stays at the current dose;
if the strongest key is above the target key, then the next cohort de-escalates to dose j − 1.
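The R sketch below illustrates the strongest-key calculation for a single dose: with a Beta(1, 1) prior, the posterior of the toxicity probability is Beta(1 + y, 1 + n − y), and the strongest key is the equal-width key holding the most posterior mass. The target key (0.25, 0.35), the handling of the partial end keys, and the function name keyboard_decision are illustrative assumptions.

```r
# Keyboard decision for one dose: find the key with the largest posterior mass
# and compare it with the target key. Partial keys at the two ends are ignored.
keyboard_decision <- function(y, n, target_key = c(0.25, 0.35)) {
  width  <- diff(target_key)
  breaks <- unique(c(rev(seq(target_key[1], 0, by = -width)),
                     seq(target_key[2], 1, by = width)))       # keys aligned with target key
  mass <- pbeta(breaks[-1], 1 + y, 1 + n - y) -
          pbeta(head(breaks, -1), 1 + y, 1 + n - y)            # posterior mass of each key
  k <- which.max(mass)
  strongest <- breaks[c(k, k + 1)]
  if (strongest[2] <= target_key[1]) "escalate"
  else if (strongest[1] >= target_key[2]) "de-escalate"
  else "stay"
}

keyboard_decision(y = 1, n = 6)   # 1 DLT in 6 patients: strongest key below the target key
```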
To address two-dimensional dose-finding, the authors define five strategies of admissible escalation and de-escalation sets. Based on simulations, the strategy whose admissible escalation and de-escalation sets are the same as those of the combinational BOIN design is recommended.
Start-up phase: this method does not have a start-up phase.
Escalation/de-escalation dose set: several dose assignment algorithms have been proposed for the Keyboard design, and the authors recommend defining the admissible dose escalation set to be {(j + 1, k), (j, k + 1)} and the admissible de-escalation set to be {(j − 1, k), (j, k − 1)}.
Trial conduct:
Treat the first cohort at the lowest dose combination (1, 1).
Suppose that the current dose is combination (j, k). If the strongest key is below the target key, escalate to the dose combination that belongs to the admissible escalation set and has the largest posterior probability of its toxicity lying in the target key.
If the strongest key is above the target key, de-escalate to the dose combination that belongs to the admissible de-escalation set and has the largest posterior probability of its toxicity lying in the target key.
Otherwise, if the strongest key is the target key, stay at the current dose.
Dose combinations whose posterior probability of the toxicity exceeding ϕ is greater than λ will be permanently excluded, where λ is a pre-specified threshold probability. If the lowest dose combination (1, 1) satisfies this stopping rule, the trial will be terminated early.
Continue until all N subjects are exhausted.
MTD determination: after the trial is completed, isotonic regression will be used to identify the MTD.
3. Simulation studies
In the simulation studies, our goal is to identify a single MTD of two combined agents. Simulation settings are borrowed from previous studies [9,26] and shown in Table 2. The target toxicity probability is 0.3. All designs started with the lowest dose combination. The cohort size was set to 3 for all designs that use cohorts as the dose-assignment unit, unless otherwise specified. Two thousand simulation runs were generated for each scenario.
Table 2.
Simulation settings.
| Agent A | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Agent B | 1 | 2 | 3 | 4 | 5 | 1 | 2 | 3 | 4 | 5 | 1 | 2 | 3 | 4 | 5 |
| Scenario 1 | Scenario 2 | Scenario 3 | |||||||||||||
| 1 | 0.05 | 0.1 | 0.15 | 0.3 | 0.45 | 0.15 | 0.3 | 0.45 | 0.5 | 0.6 | 0.02 | 0.07 | 0.1 | 0.15 | 0.3 |
| 2 | 0.1 | 0.15 | 0.3 | 0.45 | 0.55 | 0.3 | 0.45 | 0.5 | 0.6 | 0.75 | 0.07 | 0.1 | 0.15 | 0.3 | 0.45 |
| 3 | 0.15 | 0.3 | 0.45 | 0.5 | 0.6 | 0.45 | 0.55 | 0.6 | 0.7 | 0.8 | 0.1 | 0.15 | 0.3 | 0.45 | 0.55 |
| Scenario 4 | Scenario 5 | Scenario 6 | |||||||||||||
| 1 | 0.3 | 0.45 | 0.6 | 0.7 | 0.8 | 0.01 | 0.02 | 0.08 | 0.1 | 0.11 | 0.05 | 0.08 | 0.1 | 0.13 | 0.15 |
| 2 | 0.45 | 0.55 | 0.65 | 0.75 | 0.85 | 0.03 | 0.05 | 0.1 | 0.13 | 0.15 | 0.09 | 0.12 | 0.15 | 0.3 | 0.45 |
| 3 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 0.07 | 0.09 | 0.12 | 0.15 | 0.3 | 0.15 | 0.3 | 0.45 | 0.5 | 0.6 |
| Scenario 7 | Scenario 8 | Scenario 9 | |||||||||||||
| 1 | 0.07 | 0.1 | 0.12 | 0.15 | 0.3 | 0.02 | 0.1 | 0.15 | 0.5 | 0.6 | 0.005 | 0.01 | 0.02 | 0.04 | 0.07 |
| 2 | 0.15 | 0.3 | 0.45 | 0.52 | 0.6 | 0.05 | 0.12 | 0.3 | 0.55 | 0.7 | 0.02 | 0.05 | 0.08 | 0.12 | 0.15 |
| 3 | 0.3 | 0.5 | 0.6 | 0.65 | 0.75 | 0.08 | 0.15 | 0.45 | 0.6 | 0.8 | 0.15 | 0.3 | 0.45 | 0.55 | 0.65 |
| Scenario 10 | Scenario 11 | Scenario 12 | |||||||||||||
| 1 | 0.05 | 0.1 | 0.15 | 0.3 | 0.45 | 0.08 | 0.14 | 0.19 | 0.3 | 0.05 | 0.1 | 0.2 | 0.3 | ||
| 2 | 0.45 | 0.5 | 0.6 | 0.65 | 0.7 | 0.1 | 0.2 | 0.3 | 0.55 | 0.08 | 0.3 | 0.45 | 0.5 | ||
| 3 | 0.7 | 0.75 | 0.8 | 0.85 | 0.9 | 0.15 | 0.3 | 0.52 | 0.6 | 0.15 | 0.35 | 0.5 | 0.55 | ||
| 4 | 0.3 | 0.5 | 0.6 | 0.7 | 0.3 | 0.5 | 0.6 | 0.7 | |||||||
| Scenario 13 | Scenario 14 | Scenario 15 | |||||||||||||
| 1 | 0.05 | 0.08 | 0.1 | 0.3 | 0.01 | 0.05 | 0.1 | 0.3 | 0.01 | 0.1 | 0.15 | 0.45 | |||
| 2 | 0.08 | 0.1 | 0.2 | 0.35 | 0.05 | 0.1 | 0.45 | 0.5 | 0.03 | 0.3 | 0.4 | 0.5 | |||
| 3 | 0.1 | 0.2 | 0.3 | 0.4 | 0.1 | 0.45 | 0.5 | 0.6 | 0.05 | 0.5 | 0.55 | 0.65 | |||
| 4 | 0.3 | 0.35 | 0.4 | 0.6 | 0.3 | 0.5 | 0.6 | 0.65 | 0.08 | 0.55 | 0.6 | 0.75 | |||
3.1. Simulation scenarios
A total of 15 scenarios are displayed in Table 2. In the first 10 scenarios, agent A has 5 dose levels and agent B has 3. In scenarios 11 to 15, both agents have 4 dose levels. Target toxicity rate 0.3 is bolded. Among the first ten matrices: scenario 1 contains multiple MTD locations that are in the middle of matrix and diagonally connected; scenarios 2 and 4 represent over-toxic situations while scenario 4 is more extreme; scenarios 3 and 5 represent over-conservative situations while scenario 5 is more extreme; scenarios 6 and 7 contain multiple MTD locations but those locations are more scattered; scenarios 8, 9, and 10 contain single MTD at different locations. Among the last five square matrices: scenario 11 contains multiple MTD locations that are in the middle of matrix and diagonally connected; scenarios 12 and 13 contain multiple but more scattered MTD locations; scenario 14 contains two MTD locations that are at the bottom left and top right; scenario 15 has single MTD location.
3.2. Evaluation metrics
Four evaluation metrics are used: (1) correct MTD selection, defined as the proportion of simulation runs that correctly identified the MTD among all 2000 simulations; (2) over-toxic MTD selection, defined as the proportion of simulation runs that identified over-toxic doses as the MTD among all 2000 simulations; (3) correct patient assignment, defined as the average proportion of patients assigned to the MTD during the trial across all 2000 simulations; and (4) over-toxic patient assignment, defined as the average proportion of patients assigned to over-toxic doses during the trial across all 2000 simulations. The first two metrics evaluate the performance of the designs in terms of MTD selection: the larger the correct MTD selection, the more accurate the design is in selecting the correct MTD; the larger the over-toxic MTD selection, the more aggressive the design is in selecting the MTD. The last two metrics characterize the designs during trial conduct: the larger the correct patient assignment, the more accurate patient assignment is during the trial; the larger the over-toxic patient assignment, the more aggressive the design is in dose escalation during the trial. Ideally, a design should show relatively large correct MTD selection and correct patient assignment but small over-toxic MTD selection and over-toxic patient assignment.
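The R sketch below computes these four metrics from simulated trial output; the input format (a vector of selected combination indices and a trials-by-combinations matrix of patient counts) and the function name trial_metrics are our own assumptions about how simulation results are stored.

```r
# Compute the four operating characteristics from simulated trials.
# selected: selected MTD index per trial (NA if no selection); alloc: matrix of patient
# counts (one row per trial, one column per combination); mtd_set / overtox_set: index
# sets of the true MTDs and of the over-toxic combinations.
trial_metrics <- function(selected, alloc, mtd_set, overtox_set) {
  c(correct_selection    = mean(selected %in% mtd_set),
    overtoxic_selection  = mean(selected %in% overtox_set),
    correct_assignment   = mean(rowSums(alloc[, mtd_set, drop = FALSE]) / rowSums(alloc)),
    overtoxic_assignment = mean(rowSums(alloc[, overtox_set, drop = FALSE]) / rowSums(alloc)))
}

# Example with three simulated trials on four combinations (combination 2 is the MTD,
# combination 4 is over-toxic)
selected <- c(2, 2, 4)
alloc    <- rbind(c(3, 9, 3, 0), c(0, 12, 3, 0), c(3, 3, 6, 3))
trial_metrics(selected, alloc, mtd_set = 2, overtox_set = 4)
```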
3.3. Design specifications
For I2D, we implemented the published R code [7]. We set the cohort size of the start-up phase to 1, based on suggestions from simulation studies when the target toxicity rate is 0.3 [11,36], and the interaction term to 0 to be consistent with the paper's focus. The prior of the parameters is the product of two independent exponential distributions with mean 1, the same as that used in the I2D study [36].
For Copula, the website (http://www.blackwellpublishing.com/rss) where the simulation programs were originally published is no longer accessible, so we used the executable file on the website https://odin.mdacc.tmc.edu/yyuan/index_code.html. The executable file uses escalation and de-escalation probability boundaries fixed at 0.8 and 0.45, respectively.
Hierarchy was implemented via R codes from website http://www-personal.umich.edu/tombraun/software.html. We set to be 10 based on the suggestion in the paper [3]. Together with other recommendations from the authors, we could obtain priors for all involved parameters. Details are described in Section 2.5.
POCRM was implemented via the R package pocrm. We utilized six possible partial orderings as suggested [32,35]: across rows, across columns, up diagonals, down diagonals, up-down diagonals, and down-up diagonals. As we do not have information about which partial ordering is more likely, the prior probabilities of all six partial orderings were set to be equal. The skeleton required by the program was obtained using the getprior function in the package dfcrm, which implements the algorithm of Lee and Cheung [15], as suggested [35]. For the start-up phase, we used the 'zoning' method as suggested [33,34].
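For instance, a skeleton matching the 'Main setting' in Table 1 for the 5 × 3 scenarios (half-width 0.05 and prior MTD position 11 among the 15 combinations) can be generated as below; the call shown is an illustration of how getprior was used.

```r
library(dfcrm)
# Skeleton for 15 dose combinations: half-width 0.05, target 0.3, prior MTD at position 11
skeleton <- getprior(halfwidth = 0.05, target = 0.3, nu = 11, nlevel = 15)
round(skeleton, 3)
```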
Design cBOIN was implemented via R package BOIN. For cBOIN, the interval boundaries were set to be 0.18 and 0.42 as suggested [18].
Design cKeyboard was implemented via R package Keyboard.
DFCOMB was implemented via the R package dfcomb. As recommended by the authors, we used a vague normal prior for the intercept and interaction parameters and informative exponential priors for the slope parameters [28]. For DFCOMB, we set the target toxicity boundaries as 0.18 and 0.42 to be consistent with cBOIN. As one of our reviewers suggested, we also tried setting these toxicity boundaries as 0.25 and 0.35.
Design gCRM was implemented via R codes from website http://www-personal.umich.edu/tombraun/software.html. As the authors suggested, we used a Gamma prior with mean 1 and variance 1 for β, normal prior with mean -8 and variance 1 for , normal prior with mean and variance 2 for , where [2].
Design bCRM was implemented via R codes from the authors. Its skeleton setting is the same as POCRM.
For most model-based designs, several design parameters are involved. Some of these parameters have recommended specifications provided by the authors; for example, the interval boundaries ϕ_1 and ϕ_2 in the cBOIN design are recommended to be set to 0.6ϕ and 1.4ϕ, where ϕ is the target toxicity probability [18]. However, some design parameters lack authors' suggested specifications, and their influence on design performance is not clear. Therefore, we list such design parameters and the corresponding designs in Table 1, where the columns 'Main setting' and 'Alternative setting' contain the parameter specifications used in our simulation studies.
Table 1.
Explored design parameters.
| Main Setting | Alternative Setting | ||||
|---|---|---|---|---|---|
| Design | Parameter | ||||
| I2D Copula DFCOMB | and | : 0.1, 0.2, 0.25, 0.3, 0.35 | : 0.1, 0.2, 0.25, 0.3 | : 0.05, 0.1, 0.2, 0.25, 0.3 | : 0.05, 0.1, 0.2, 0.22 |
| : 0.1, 0.3, 0.35 | : 0.1, 0.2, 0.25, 0.3 | : 0.1, 0.2, 0.25 | : 0.05, 0.1, 0.2, 0.22 | ||
| POCRM bCRM | Skeleton setting | half width: 0.05 | half width: 0.05 | half width: 0.03 | half width: 0.03 |
| MTD position: 11 | MTD position: 12 | MTD position: 13 | MTD position: 15 | ||
| DFCOMB bCRM | Escalation/de-escalation probability cutoff | 0.85 and 0.45 | 0.6 and 0.6 | ||
| Hierarchy gCRM | and | truth of each scenario | incorrect guess | ||
For all designs and scenarios, the maximum sample size was set to be 60. This number is widely used in other studies [2,17,18,24,26,39]. To verify that our conclusions are still valid with a different sample size, we repeated all simulations with a maximum sample size of 30.
4. Results
4.1. When maximum sample size is 60
Tables 3-6 display design performances in terms of correct MTD selection, over-toxic MTD selection, correct patient assignment, and over-toxic patient assignment, respectively. In these tables, I2D with the p_j and q_k specification in column 'Main setting' of Table 1 is denoted as 'I2D'; I2D with the p_j and q_k specification in column 'Alternative setting' is denoted as 'I2D.pq'. Copula with the p_j and q_k specification in column 'Main setting' is denoted as 'Copula'; Copula with the p_j and q_k specification in column 'Alternative setting' is denoted as 'Copula.pq'. Hierarchy with the prior toxicity guesses specified in column 'Main setting' is denoted as 'Hierarchy'; Hierarchy with the prior toxicity guesses specified in column 'Alternative setting' is denoted as 'Hierarchy.pi'. POCRM with the skeleton specification in column 'Main setting' is denoted as 'POCRM'; POCRM with the skeleton specification in column 'Alternative setting' is denoted as 'POCRM.skeleton'. DFCOMB with the p_j and q_k specification and the escalation/de-escalation probability cutoffs in column 'Main setting' is denoted as 'DFCOMB'; DFCOMB with the p_j and q_k specification in column 'Alternative setting' is denoted as 'DFCOMB.pq'; DFCOMB with the escalation/de-escalation probability cutoff specification in column 'Alternative setting' is denoted as 'DFCOMB.cut'; DFCOMB with the p_j and q_k specification and the escalation/de-escalation probability cutoffs in column 'Main setting', but with toxicity boundaries of 0.25 and 0.35 as suggested by one of our reviewers, is denoted as 'DFCOMB.sensitive'. Design gCRM with the prior toxicity guesses specified in column 'Main setting' is denoted as 'gCRM'; gCRM with the prior toxicity guesses specified in column 'Alternative setting' is denoted as 'gCRM.pi'. Design bCRM with the skeleton and escalation/de-escalation probability cutoff specifications in column 'Main setting' is denoted as 'bCRM'; bCRM with the skeleton specification in column 'Alternative setting' is denoted as 'bCRM.skeleton'; bCRM with the escalation/de-escalation probability cutoff specification in column 'Alternative setting' is denoted as 'bCRM.cut'. We marked design metrics that are outstandingly poor in red, and those that are poor but not outstandingly different from the others in magenta.
Table 4.
Performance of MTD selection of designs across scenarios when maximum sample size is 60.
| Simulation scenario | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
| Design | Selection of over-toxic MTD | ||||||||||||||
| I2D | 0.18 | 0.18 | 0.18 | 0.10 | 0 | 0.29 | 0.07 | 0.26 | 0.39 | 0.32 | 0.12 | 0.18 | 0.43 | 0.31 | 0.28 |
| I2D.pq | 0.19 | 0.19 | 0.18 | 0.19 | 0 | 0.25 | 0.17 | 0.30 | 0.42 | 0.41 | 0.19 | 0.32 | 0.43 | 0.35 | 0.45 |
| Copula | 0.32 | 0.19 | 0.36 | 0.09 | 0 | 0.43 | 0.22 | 0.58 | 0.55 | 0.63 | 0.19 | 0.43 | 0.66 | 0.69 | 0.50 |
| Copula.pq | 0.30 | 0.16 | 0.33 | 0.09 | 0 | 0.46 | 0.33 | 0.44 | 0.53 | 0.78 | 0.22 | 0.51 | 0.41 | 0.62 | 0.54 |
| Hierarchy | 0.22 | 0.18 | 0.21 | 0.16 | 0 | 0.30 | 0.31 | 0.30 | 0.17 | 0.20 | 0.13 | 0.37 | 0.39 | 0.41 | 0.22 |
| Hierarchy.pi | 0.22 | 0.23 | 0.14 | 0.24 | 0 | 0.29 | 0.24 | 0.27 | 0.40 | 0.39 | 0.13 | 0.43 | 0.44 | 0.54 | 0.47 |
| POCRM | 0.12 | 0.24 | 0.08 | 0.22 | 0 | 0.11 | 0.19 | 0.17 | 0.18 | 0.26 | 0.04 | 0.30 | 0.28 | 0.32 | 0.31 |
| POCRM.skeleton | 0.15 | 0.24 | 0.12 | 0.20 | 0 | 0.17 | 0.21 | 0.18 | 0.23 | 0.27 | 0.06 | 0.34 | 0.32 | 0.34 | 0.39 |
| DFCOMB | 0.18 | 0.08 | 0.17 | 0.06 | 0 | 0.27 | 0.09 | 0.27 | 0.57 | 0.32 | 0.08 | 0.19 | 0.25 | 0.61 | 0.21 |
| DFCOMB.pq | 0.10 | 0.06 | 0.14 | 0.08 | 0 | 0.31 | 0.13 | 0.37 | 0.62 | 0.52 | 0.08 | 0.30 | 0.28 | 0.51 | 0.45 |
| DFCOMB.cut | 0.08 | 0.07 | 0.11 | 0.08 | 0 | 0.07 | 0.05 | 0.19 | 0.37 | 0.20 | 0.02 | 0.12 | 0.16 | 0.53 | 0.23 |
| DFCOMB.sensitive | 0.25 | 0.10 | 0.21 | 0.09 | 0 | 0.36 | 0.13 | 0.36 | 0.65 | 0.37 | 0.12 | 0.25 | 0.32 | 0.66 | 0.27 |
| gCRM | 0.15 | 0.13 | 0.13 | 0.10 | 0 | 0.18 | 0.10 | 0.31 | 0.11 | 0.33 | 0.06 | 0.32 | 0.31 | 0.52 | 0.29 |
| gCRM.pi | 0.18 | 0.14 | 0.17 | 0.10 | 0 | 0.19 | 0.08 | 0.29 | 0.13 | 0.26 | 0.07 | 0.22 | 0.38 | 0.45 | 0.44 |
| cBOIN | 0.16 | 0.21 | 0.15 | 0.17 | 0 | 0.19 | 0.13 | 0.21 | 0.13 | 0.31 | 0.08 | 0.29 | 0.43 | 0.34 | 0.29 |
| cKeyboard | 0.17 | 0.21 | 0.14 | 0.17 | 0 | 0.20 | 0.14 | 0.21 | 0.12 | 0.31 | 0.09 | 0.27 | 0.43 | 0.34 | 0.30 |
| bCRM | 0.08 | 0.22 | 0.05 | 0.24 | 0 | 0.11 | 0.15 | 0.20 | 0.20 | 0.38 | 0.03 | 0.27 | 0.19 | 0.53 | 0.42 |
| bCRM.skeleton | 0.12 | 0.22 | 0.08 | 0.16 | 0 | 0.17 | 0.21 | 0.29 | 0.43 | 0.50 | 0.05 | 0.33 | 0.21 | 0.58 | 0.46 |
| bCRM.cut | 0.12 | 0.30 | 0.07 | 0.29 | 0 | 0.12 | 0.19 | 0.19 | 0.17 | 0.36 | 0.04 | 0.32 | 0.24 | 0.53 | 0.44 |
Notes: I2D: design I2D with parameter and specified in ‘Main setting’ of Table 1.
I2D.pq: design I2D with parameter and specified in ‘Alternative setting’.
Copula: design Copula with parameter and specified in ‘Main setting’.
Copula.pq: design Copula with parameter and specified in ‘Alternative setting’.
Hierarchy: design Hierarchy with and specified in ‘Main setting’.
Hierarchy.pi: design Hierarchy with and specified in ‘Alternative setting’.
POCRM: design POCRM with skeleton specified in ‘Main setting’.
POCRM.skeleton: design POCRM with skeleton specified in ‘Alternative setting’.
DFCOMB: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.pq: design DFCOMB with and specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.cut: design DFCOMB with and specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
DFCOMB.sensitive: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’, but target toxicity interval boundaries suggested by one of our reviewers.
gCRM: design gCRM with and specified in ‘Main setting’.
gCRM.pi: design gCRM with and specified in ‘Alternative setting’.
bCRM: design bCRM with skeleton, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.skeleton: design bCRM with skeleton specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.cut: design bCRM with skeleton specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
Table 5.
Performance of patient assignment of designs across scenarios when maximum sample size is 60.
| Simulation scenario | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
| Design | Patient receiving correct MTD during trials | ||||||||||||||
| I2D | 0.38 | 0.57 | 0.45 | 0.71 | 0.58 | 0.23 | 0.50 | 0.07 | 0.05 | 0.33 | 0.44 | 0.39 | 0.25 | 0.27 | 0.18 |
| I2D.pq | 0.45 | 0.56 | 0.44 | 0.55 | 0.61 | 0.26 | 0.42 | 0.15 | 0.03 | 0.28 | 0.50 | 0.34 | 0.27 | 0.28 | 0.11 |
| Copula | 0.33 | 0.57 | 0.31 | 0.72 | 0.46 | 0.29 | 0.41 | 0.12 | 0.31 | 0.09 | 0.32 | 0.33 | 0.08 | 0.07 | 0.34 |
| Copula.pq | 0.41 | 0.55 | 0.37 | 0.63 | 0.49 | 0.27 | 0.43 | 0.26 | 0.18 | 0.00 | 0.40 | 0.23 | 0.27 | 0.07 | 0.12 |
| Hierarchy | 0.42 | 0.50 | 0.43 | 0.68 | 0.68 | 0.31 | 0.35 | 0.22 | 0.30 | 0.35 | 0.43 | 0.27 | 0.38 | 0.15 | 0.39 |
| Hierarchy.pi | 0.42 | 0.40 | 0.52 | 0.49 | 0.65 | 0.30 | 0.41 | 0.26 | 0.07 | 0.04 | 0.43 | 0.23 | 0.27 | 0.10 | 0.16 |
| POCRM | 0.53 | 0.53 | 0.47 | 0.64 | 0.33 | 0.37 | 0.43 | 0.35 | 0.30 | 0.36 | 0.57 | 0.39 | 0.33 | 0.36 | 0.33 |
| POCRM.skeleton | 0.52 | 0.48 | 0.48 | 0.68 | 0.47 | 0.36 | 0.38 | 0.35 | 0.29 | 0.33 | 0.56 | 0.38 | 0.34 | 0.35 | 0.20 |
| DFCOMB | 0.23 | 0.45 | 0.37 | 0.91 | 0.39 | 0.16 | 0.33 | 0.23 | 0.13 | 0.16 | 0.20 | 0.28 | 0.12 | 0.11 | 0.35 |
| DFCOMB.pq | 0.34 | 0.48 | 0.37 | 0.91 | 0.37 | 0.17 | 0.33 | 0.20 | 0.06 | 0.04 | 0.28 | 0.29 | 0.19 | 0.23 | 0.28 |
| DFCOMB.cut | 0.34 | 0.46 | 0.39 | 0.77 | 0.51 | 0.23 | 0.41 | 0.17 | 0.17 | 0.28 | 0.39 | 0.33 | 0.21 | 0.11 | 0.26 |
| DFCOMB.sensitive | 0.23 | 0.45 | 0.37 | 0.91 | 0.39 | 0.16 | 0.33 | 0.23 | 0.13 | 0.16 | 0.20 | 0.28 | 0.12 | 0.11 | 0.35 |
| gCRM.b | 0.46 | 0.45 | 0.49 | 0.75 | 0.69 | 0.36 | 0.46 | 0.27 | 0.29 | 0.36 | 0.44 | 0.34 | 0.34 | 0.14 | 0.33 |
| gCRM.pi | 0.37 | 0.44 | 0.41 | 0.75 | 0.68 | 0.32 | 0.54 | 0.16 | 0.25 | 0.31 | 0.43 | 0.40 | 0.25 | 0.18 | 0.18 |
| cBOIN | 0.43 | 0.49 | 0.40 | 0.72 | 0.43 | 0.34 | 0.46 | 0.21 | 0.26 | 0.20 | 0.44 | 0.37 | 0.23 | 0.21 | 0.25 |
| cKeyboard | 0.42 | 0.49 | 0.40 | 0.72 | 0.43 | 0.33 | 0.44 | 0.21 | 0.25 | 0.20 | 0.43 | 0.37 | 0.23 | 0.22 | 0.24 |
| bCRM | 0.43 | 0.52 | 0.37 | 0.70 | 0.24 | 0.26 | 0.35 | 0.24 | 0.18 | 0.24 | 0.36 | 0.37 | 0.26 | 0.21 | 0.30 |
| bCRM.skeleton | 0.46 | 0.49 | 0.44 | 0.77 | 0.34 | 0.28 | 0.37 | 0.25 | 0.16 | 0.19 | 0.42 | 0.37 | 0.31 | 0.20 | 0.28 |
| bCRM.cut | 0.47 | 0.44 | 0.47 | 0.52 | 0.36 | 0.30 | 0.35 | 0.26 | 0.15 | 0.25 | 0.47 | 0.35 | 0.29 | 0.21 | 0.24 |
Notes: I2D: design I2D with parameter and specified in ‘Main setting’ of Table 1.
I2D.pq: design I2D with parameter and specified in ‘Alternative setting’.
Copula: design Copula with parameter and specified in ‘Main setting’.
Copula.pq: design Copula with parameter and specified in ‘Alternative setting’.
Hierarchy: design Hierarchy with and specified in ‘Main setting’.
Hierarchy.pi: design Hierarchy with and specified in ‘Alternative setting’.
POCRM: design POCRM with skeleton specified in ‘Main setting’.
POCRM.skeleton: design POCRM with skeleton specified in ‘Alternative setting’.
DFCOMB: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.pq: design DFCOMB with and specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.cut: design DFCOMB with and specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
DFCOMB.sensitive: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’, but target toxicity interval boundaries suggested by one of our reviewers.
gCRM: design gCRM with and specified in ‘Main setting’.
gCRM.pi: design gCRM with and specified in ‘Alternative setting’.
bCRM: design bCRM with skeleton, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.skeleton: design bCRM with skeleton specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.cut: design bCRM with skeleton specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
Table 3.
Performance of MTD selection of designs across scenarios when maximum sample size is 60.
| Simulation scenario | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
| Design | Selection of correct MTD | ||||||||||||||
| I2D | 0.43 | 0.71 | 0.41 | 0.9 | 0.86 | 0.34 | 0.59 | 0.11 | 0.15 | 0.43 | 0.47 | 0.47 | 0.23 | 0.25 | 0.24 |
| I2D.pq | 0.58 | 0.73 | 0.54 | 0.81 | 0.88 | 0.35 | 0.54 | 0.2 | 0.09 | 0.3 | 0.58 | 0.45 | 0.28 | 0.4 | 0.16 |
| Copula | 0.53 | 0.6 | 0.52 | 0.1 | 0.9 | 0.44 | 0.65 | 0.15 | 0.32 | 0.26 | 0.56 | 0.45 | 0.12 | 0.12 | 0.42 |
| Copula.pq | 0.59 | 0.6 | 0.58 | 0.06 | 0.91 | 0.37 | 0.55 | 0.4 | 0.12 | 0.01 | 0.53 | 0.26 | 0.4 | 0.05 | 0.13 |
| Hierarchy | 0.61 | 0.64 | 0.62 | 0.61 | 0.81 | 0.45 | 0.45 | 0.3 | 0.47 | 0.52 | 0.6 | 0.28 | 0.47 | 0.24 | 0.52 |
| Hierarchy.pi | 0.61 | 0.5 | 0.71 | 0.32 | 0.78 | 0.42 | 0.52 | 0.42 | 0.09 | 0.11 | 0.6 | 0.24 | 0.39 | 0.11 | 0.17 |
| POCRM | 0.75 | 0.71 | 0.69 | 0.78 | 0.54 | 0.59 | 0.56 | 0.59 | 0.52 | 0.58 | 0.74 | 0.52 | 0.46 | 0.57 | 0.48 |
| POCRM.skeleton | 0.74 | 0.67 | 0.71 | 0.8 | 0.73 | 0.57 | 0.51 | 0.6 | 0.49 | 0.53 | 0.77 | 0.5 | 0.44 | 0.54 | 0.35 |
| DFCOMB | 0.54 | 0.76 | 0.66 | 0.65 | 0.54 | 0.33 | 0.69 | 0.48 | 0.15 | 0.47 | 0.44 | 0.63 | 0.37 | 0.18 | 0.67 |
| DFCOMB.pq | 0.69 | 0.8 | 0.65 | 0.64 | 0.52 | 0.34 | 0.71 | 0.3 | 0.09 | 0.14 | 0.57 | 0.56 | 0.33 | 0.35 | 0.45 |
| DFCOMB.cut | 0.59 | 0.79 | 0.64 | 0.65 | 0.54 | 0.26 | 0.70 | 0.36 | 0.24 | 0.57 | 0.51 | 0.63 | 0.36 | 0.15 | 0.60 |
| DFCOMB.sensitive | 0.54 | 0.78 | 0.66 | 0.62 | 0.58 | 0.33 | 0.69 | 0.44 | 0.15 | 0.47 | 0.46 | 0.61 | 0.38 | 0.20 | 0.64 |
| gCRM | 0.69 | 0.65 | 0.71 | 0.42 | 0.81 | 0.59 | 0.67 | 0.34 | 0.47 | 0.55 | 0.64 | 0.47 | 0.49 | 0.17 | 0.48 |
| gCRM.pi | 0.58 | 0.64 | 0.61 | 0.47 | 0.81 | 0.5 | 0.74 | 0.26 | 0.48 | 0.51 | 0.61 | 0.57 | 0.34 | 0.22 | 0.23 |
| cBOIN | 0.7 | 0.69 | 0.7 | 0.62 | 0.72 | 0.58 | 0.74 | 0.38 | 0.4 | 0.45 | 0.75 | 0.57 | 0.38 | 0.4 | 0.37 |
| cKeyboard | 0.67 | 0.7 | 0.7 | 0.6 | 0.72 | 0.56 | 0.71 | 0.38 | 0.4 | 0.45 | 0.73 | 0.58 | 0.38 | 0.43 | 0.36 |
| bCRM | 0.72 | 0.75 | 0.66 | 0.76 | 0.52 | 0.51 | 0.63 | 0.51 | 0.37 | 0.5 | 0.62 | 0.54 | 0.39 | 0.35 | 0.47 |
| bCRM.skeleton | 0.75 | 0.72 | 0.73 | 0.84 | 0.69 | 0.59 | 0.64 | 0.51 | 0.36 | 0.36 | 0.69 | 0.52 | 0.47 | 0.33 | 0.45 |
| bCRM.cut | 0.75 | 0.67 | 0.71 | 0.71 | 0.59 | 0.56 | 0.63 | 0.54 | 0.40 | 0.52 | 0.71 | 0.53 | 0.40 | 0.37 | 0.43 |
Notes: I2D: design I2D with parameter and specified in ‘Main setting’ of Table 1.
I2D.pq: design I2D with parameter and specified in ‘Alternative setting’.
Copula: design Copula with parameter and specified in ‘Main setting’.
Copula.pq: design Copula with parameter and specified in ‘Alternative setting’.
Hierarchy: design Hierarchy with and specified in ‘Main setting’.
Hierarchy.pi: design Hierarchy with and specified in ‘Alternative setting’.
POCRM: design POCRM with skeleton specified in ‘Main setting’.
POCRM.skeleton: design POCRM with skeleton specified in ‘Alternative setting’.
DFCOMB: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.pq: design DFCOMB with and specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.cut: design DFCOMB with and specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
DFCOMB.sensitive: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’, but target toxicity interval boundaries suggested by one of our reviewers.
gCRM: design gCRM with and specified in ‘Main setting’.
gCRM.pi: design gCRM with and specified in ‘Alternative setting’.
bCRM: design bCRM with skeleton, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.skeleton: design bCRM with skeleton specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.cut: design bCRM with skeleton specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
Table 6.
Performance of patient assignment of designs across scenarios when maximum sample size is 60.
| Simulation scenario | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
| Design | Patient receiving over-toxic doses during trials | ||||||||||||||
| I2D | 0.29 | 0.27 | 0.19 | 0.29 | 0 | 0.26 | 0.16 | 0.37 | 0.35 | 0.31 | 0.15 | 0.28 | 0.36 | 0.38 | 0.42 |
| I2D.pq | 0.26 | 0.35 | 0.22 | 0.45 | 0 | 0.30 | 0.25 | 0.33 | 0.36 | 0.34 | 0.20 | 0.38 | 0.40 | 0.44 | 0.55 |
| Copula | 0.17 | 0.13 | 0.19 | 0.28 | 0 | 0.18 | 0.13 | 0.31 | 0.23 | 0.41 | 0.13 | 0.28 | 0.39 | 0.44 | 0.30 |
| Copula.pq | 0.17 | 0.20 | 0.16 | 0.37 | 0 | 0.18 | 0.17 | 0.26 | 0.23 | 0.45 | 0.14 | 0.36 | 0.27 | 0.44 | 0.41 |
| Hierarchy | 0.35 | 0.35 | 0.31 | 0.32 | 0 | 0.36 | 0.36 | 0.43 | 0.30 | 0.44 | 0.24 | 0.44 | 0.44 | 0.60 | 0.38 |
| Hierarchy.pi | 0.34 | 0.46 | 0.23 | 0.51 | 0 | 0.33 | 0.33 | 0.38 | 0.38 | 0.52 | 0.24 | 0.48 | 0.47 | 0.58 | 0.51 |
| POCRM | 0.16 | 0.32 | 0.11 | 0.36 | 0 | 0.15 | 0.24 | 0.24 | 0.21 | 0.38 | 0.10 | 0.33 | 0.25 | 0.36 | 0.34 |
| POCRM.skeleton | 0.20 | 0.37 | 0.16 | 0.29 | 0 | 0.21 | 0.27 | 0.29 | 0.26 | 0.40 | 0.12 | 0.34 | 0.29 | 0.38 | 0.45 |
| DFCOMB | 0.11 | 0.05 | 0.14 | 0.09 | 0 | 0.14 | 0.08 | 0.23 | 0.25 | 0.23 | 0.06 | 0.15 | 0.21 | 0.45 | 0.26 |
| DFCOMB.pq | 0.09 | 0.07 | 0.14 | 0.09 | 0 | 0.17 | 0.13 | 0.29 | 0.28 | 0.33 | 0.06 | 0.18 | 0.19 | 0.36 | 0.30 |
| DFCOMB.cut | 0.21 | 0.17 | 0.24 | 0.23 | 0 | 0.22 | 0.19 | 0.35 | 0.42 | 0.30 | 0.10 | 0.24 | 0.28 | 0.55 | 0.44 |
| DFCOMB.sensitive | 0.11 | 0.05 | 0.14 | 0.09 | 0 | 0.14 | 0.08 | 0.23 | 0.25 | 0.23 | 0.06 | 0.15 | 0.21 | 0.45 | 0.26 |
| gCRM.b | 0.27 | 0.27 | 0.24 | 0.25 | 0 | 0.29 | 0.21 | 0.34 | 0.23 | 0.42 | 0.16 | 0.37 | 0.38 | 0.54 | 0.35 |
| gCRM.pi | 0.31 | 0.28 | 0.27 | 0.25 | 0 | 0.28 | 0.17 | 0.36 | 0.23 | 0.37 | 0.17 | 0.31 | 0.41 | 0.47 | 0.46 |
| cBOIN | 0.20 | 0.27 | 0.17 | 0.28 | 0 | 0.22 | 0.20 | 0.27 | 0.21 | 0.38 | 0.15 | 0.28 | 0.33 | 0.37 | 0.32 |
| cKeyboard | 0.20 | 0.27 | 0.17 | 0.28 | 0 | 0.22 | 0.21 | 0.27 | 0.20 | 0.38 | 0.15 | 0.28 | 0.33 | 0.37 | 0.32 |
| bCRM | 0.11 | 0.27 | 0.06 | 0.30 | 0 | 0.11 | 0.17 | 0.23 | 0.16 | 0.33 | 0.07 | 0.21 | 0.15 | 0.40 | 0.33 |
| bCRM.skeleton | 0.15 | 0.26 | 0.08 | 0.23 | 0 | 0.14 | 0.19 | 0.27 | 0.22 | 0.34 | 0.08 | 0.24 | 0.18 | 0.43 | 0.36 |
| bCRM.cut | 0.23 | 0.45 | 0.12 | 0.48 | 0 | 0.19 | 0.30 | 0.34 | 0.28 | 0.45 | 0.13 | 0.35 | 0.23 | 0.50 | 0.47 |
Notes: I2D: design I2D with parameter and specified in ‘Main setting’ of Table 1.
I2D.pq: design I2D with parameter and specified in ‘Alternative setting’.
Copula: design Copula with parameter and specified in ‘Main setting’.
Copula.pq: design Copula with parameter and specified in ‘Alternative setting’.
Hierarchy: design Hierarchy with and specified in ‘Main setting’.
Hierarchy.pi: design Hierarchy with and specified in ‘Alternative setting’.
POCRM: design POCRM with skeleton specified in ‘Main setting’.
POCRM.skeleton: design POCRM with skeleton specified in ‘Alternative setting’.
DFCOMB: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.pq: design DFCOMB with and specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.cut: design DFCOMB with and specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
DFCOMB.sensitive: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’, but target toxicity interval boundaries suggested by one of our reviewers.
gCRM: design gCRM with and specified in ‘Main setting’.
gCRM.pi: design gCRM with and specified in ‘Alternative setting’.
bCRM: design bCRM with skeleton, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.skeleton: design bCRM with skeleton specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.cut: design bCRM with skeleton specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
I2D shows unstable performance in MTD identification across the simulation scenarios. Under extreme conditions such as scenarios 4 and 5, its correct MTD selection is among the best, whereas under scenarios such as 1, 3, 8, 9, and 15, its correct MTD selection is among the worst. With the alternative toxicity profiles of the individual agents (p_j and q_k), some scenarios had improved performance while others had worse, but its overall characteristics remain the same. Part of the reason for the unstable performance is that I2D always starts its model-based part with agent B at its lowest dose, no matter what happens in the start-up phase. The starting dose combination of the model-based part can therefore be very far from the location of the true MTDs, which makes it harder for I2D to identify them. Overall, I2D is not an aggressive design, as its over-toxic MTD selection and over-toxic patient assignment are relatively small under most scenarios.
Copula performed poorly in MTD identification, and its percentages of over-toxic dose selection and of patients assigned to over-toxic doses are among the worst in several scenarios. This indicates that Copula is quite aggressive, as it is more likely to select higher dose combinations as the MTD. With the alternative toxicity profiles of the individual agents (p and q), Copula's performance fluctuated across scenarios, but the overall poor performance and aggressiveness persisted. One potential reason for the unsatisfactory performance is the limited flexibility of the executable file, which does not allow parameters such as the escalation and de-escalation probability cutoffs to be changed; it is therefore reasonable to argue that the default values (0.8 and 0.45) are not optimal for some scenarios. On the other hand, since the true dose-toxicity matrix is unknown in real life, it is not feasible to obtain a single parameter set that achieves the best performance under all scenarios through simulation-based calibration.
Hierarchy is quite aggressive overall in trial conduct, as it has the worst proportion of patients assigned to over-toxic doses under several scenarios. Incorrect prior guesses of the dose combinations' toxicity probabilities (π) performed much worse than correct ones in most scenarios. A possible reason for the aggressiveness is that, unlike most other designs, Hierarchy allows simultaneous dose escalation of both agents during trial conduct. Another point worth emphasizing is that, despite a high proportion of patients assigned to over-toxic doses, Hierarchy did not outperform other designs in correct MTD selection. Features of Hierarchy such as omitting the interaction effect between agents and the absence of a start-up phase may also contribute to this poor performance.
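To make the contrast with one-agent-at-a-time escalation concrete, the short Python sketch below (our own illustration, not code from any of the evaluated designs; the matrix dimensions and current dose indices are hypothetical) enumerates the dose combinations reachable by a single escalation step, with and without allowing both agents to be escalated simultaneously.

```python
# Illustrative only: admissible escalation moves from combination (j, k) in a
# J x K dose matrix. Most designs escalate one agent at a time; Hierarchy-type
# designs additionally allow the diagonal move that escalates both agents at once.
def admissible_escalations(j, k, J, K, allow_simultaneous=False):
    """Return the dose combinations reachable by one escalation step (0-indexed)."""
    moves = []
    if j + 1 < J:
        moves.append((j + 1, k))          # escalate agent A only
    if k + 1 < K:
        moves.append((j, k + 1))          # escalate agent B only
    if allow_simultaneous and j + 1 < J and k + 1 < K:
        moves.append((j + 1, k + 1))      # escalate both agents simultaneously
    return moves

# Hypothetical 3 x 5 dose matrix, current combination (1, 2):
print(admissible_escalations(1, 2, 3, 5))                           # [(2, 2), (1, 3)]
print(admissible_escalations(1, 2, 3, 5, allow_simultaneous=True))  # adds (2, 3)
```

Allowing the diagonal move enlarges the jump in toxicity between consecutive assignments, which is one mechanical explanation for the more aggressive behavior observed above.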
POCRM shows satisfactory characteristics across all scenarios except for a low correct MTD selection percentage in scenario 5. The alternative skeleton setting in Table 1 improved this percentage from 0.54 to 0.73 in scenario 5, but it led to much worse performance metrics in scenario 15. Another finding is that POCRM performed well in scenarios (e.g. scenario 9) where the underlying true toxicity ordering of the dose combinations is not among the six orderings we used. Such results ‘validate’ the idea of POCRM: in practice we do not need to specify the correct toxicity ordering; providing orderings close to the correct one is sufficient.
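This robustness to imperfect orderings can be illustrated with a small sketch of the ordering-selection step on which POCRM is built. The Python code below is a simplified, self-contained illustration rather than the pocrm package [35] implementation: the skeleton, the two candidate orderings, the prior standard deviation, and the interim DLT data are all hypothetical. Each ordering maps the common skeleton onto the dose combinations, a one-parameter power working model is assumed, and the ordering with the largest marginal likelihood given the accumulated data (equivalently, the largest posterior weight under equal prior weights) drives the next CRM-style update.

```python
# Simplified illustration of POCRM-style ordering selection (not the pocrm package).
import numpy as np
from scipy import integrate, stats

skeleton = np.array([0.05, 0.10, 0.20, 0.30, 0.40, 0.50])   # hypothetical skeleton
# Two hypothetical candidate orderings: entry i gives the toxicity rank of combination i.
orderings = [np.array([0, 1, 2, 3, 4, 5]),
             np.array([0, 2, 1, 3, 5, 4])]
# Hypothetical interim data: patients treated and DLTs observed at each combination.
n_treated = np.array([3, 3, 3, 0, 0, 0])
n_dlt     = np.array([0, 0, 1, 0, 0, 0])

def marginal_likelihood(order, sigma=1.15):
    """Binomial likelihood under p_i = s_i^exp(a), integrated over a N(0, sigma^2) prior on a."""
    s = skeleton[order]                    # skeleton value assigned to each combination
    def integrand(a):
        p = s ** np.exp(a)
        lik = np.prod(p ** n_dlt * (1.0 - p) ** (n_treated - n_dlt))
        return lik * stats.norm.pdf(a, scale=sigma)
    value, _ = integrate.quad(integrand, -10.0, 10.0)
    return value

# With equal prior weights on the orderings, posterior weights are proportional
# to the marginal likelihoods.
weights = np.array([marginal_likelihood(o) for o in orderings])
print("posterior ordering weights:", weights / weights.sum())
print("ordering selected for the next CRM update:", int(np.argmax(weights)))
```

Because several plausible orderings compete at every update, a single mis-specified ordering is simply down-weighted rather than derailing the trial, which is consistent with the behavior observed in scenario 9.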
DFCOMB performed poorly in correct MTD selection under several scenarios, but overall it is not an aggressive design, as its percentages of over-toxic dose selection and of patients assigned to over-toxic doses are relatively small under most scenarios. With the alternative escalation and de-escalation probability cutoffs, correct MTD selection improved in some scenarios (e.g. scenarios 9 and 10) and worsened in others (e.g. scenarios 6 and 8). In addition, with the alternative specification (0.6 and 0.6 as the escalation and de-escalation probability cutoffs) we observed a higher proportion of patients assigned to over-toxic doses but better MTD identification. The worse safety is expected, as the alternative cutoff pair makes DFCOMB more willing to escalate and less willing to de-escalate; with more patients assigned to over-toxic doses, the dose toxicity probabilities are estimated more accurately, which in turn improves MTD identification. With the alternative toxicity profiles of the individual agents (p and q), a few scenarios showed obvious impact: correct MTD selection improved from 0.54 to 0.69 in scenario 1, from 0.44 to 0.57 in scenario 11, and from 0.18 to 0.35 in scenario 14, but it dropped sharply from 0.47 to 0.14 in scenario 10 and from 0.67 to 0.45 in scenario 15. With the ‘sensitivity run’ of target toxicity interval boundaries suggested by one of our reviewers, we observed slightly worse correct MTD selection in some scenarios. These results imply that the optimal design parameters are scenario-dependent; it is therefore not feasible to calibrate them through simulation in real-life clinical trials.
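To make the effect of the cutoff pair explicit, the following sketch is a deliberately simplified, model-free illustration (a Beta posterior for a single combination, not the joint two-agent model of DFCOMB or Copula) of the generic rule shared by these designs: escalate when the posterior probability that the current combination is below the target toxicity exceeds the escalation cutoff, and de-escalate when the posterior probability of exceeding the target passes the de-escalation cutoff. The cutoff pairs (0.80, 0.45) and (0.60, 0.60) echo the values discussed above; the interim data are hypothetical.

```python
# Simplified, model-free illustration of a cutoff-based escalation/de-escalation rule.
# A Beta(1, 1) prior is updated with the DLTs observed at the current combination only;
# the joint two-agent models used by the actual designs are deliberately omitted.
from scipy import stats

def decision(n_dlt, n_treated, target=0.30, c_e=0.80, c_d=0.45):
    """Return 'escalate', 'de-escalate', or 'stay' for the current combination."""
    posterior = stats.beta(1 + n_dlt, 1 + n_treated - n_dlt)
    p_under = posterior.cdf(target)        # Pr(toxicity < target | data)
    p_over = 1.0 - p_under                 # Pr(toxicity > target | data)
    if p_under > c_e:
        return "escalate"
    if p_over > c_d:
        return "de-escalate"
    return "stay"

# Same interim data (1 DLT among 6 patients), two cutoff pairs:
print(decision(1, 6, c_e=0.80, c_d=0.45))  # -> 'stay'
print(decision(1, 6, c_e=0.60, c_d=0.60))  # -> 'escalate' (looser escalation cutoff)
```

Lowering the escalation cutoff while raising the de-escalation cutoff pushes the same data toward escalation, which is consistent with the higher proportion of over-dosed patients observed under the alternative specification.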
Design gCRM performed well in most scenarios. Comparing results under correct and incorrect prior guesses of the dose combinations' toxicity probabilities (π), incorrect inputs led to worse operating characteristics in some scenarios and similar performance in the others. Interestingly, although gCRM allows simultaneous dose escalation of both agents, it did not show much aggressiveness: its percentages of over-toxic dose selection and of patients assigned to over-toxic doses are not among the largest under most scenarios. One possible reason is that gCRM assigns single patients rather than cohorts during trial conduct, so each time an over-toxic dose combination is assigned, only one patient rather than a cohort of several patients receives it. From this perspective, gCRM can be viewed as more ‘flexible’ in the dose-finding process, and this flexibility may dilute its aggressiveness.
Design bCRM is another design whose performance is unstable across scenarios: its operating characteristics in scenarios 4 and 5 are among the worst, while its metrics in the other scenarios are acceptable. Similar to DFCOMB, the alternative escalation and de-escalation probability cutoffs led to more patients assigned to over-toxic doses, but correct MTD selection did not improve. The alternative skeleton setting also influences bCRM: it improved performance in scenarios 4 and 5 but worsened it in scenarios 9 and 10. Therefore, unstable performance remains an issue even with the alternative skeleton setting or escalation and de-escalation probability cutoffs.
Both cBOIN and cKeyboard performed well across all scenarios. They are not always the top performers, but their operating characteristics are never among the worst. This property is especially important in real-life clinical trials, where we do not know which scenario reflects the truth; cBOIN and cKeyboard can therefore be expected to deliver satisfactory performance in practice.
4.2. When the maximum sample size is 30
Results using 30 as the maximum sample size are shown in Supplementary Materials Tables 5 to 12. All of the findings discussed above persist when the maximum sample size is reduced from 60 to 30. Additionally, in Supplementary Materials Table 5, POCRM with the alternative skeleton had an unsatisfactory correct MTD selection percentage in scenario 15 with a maximum sample size of 30, indicating that parameter calibration is not feasible for POCRM either.
5. Discussion
Despite recent advances in novel statistical designs for combinational agents, we found that these designs are seldom cited or used in ongoing clinical trials. In dose-finding studies of ‘combinational agents’, investigators often conduct dose-finding for one agent while keeping the second agent fixed.
Riviere et al. [27] reviewed 543 clinical trial papers published between 2011 and 2013 that investigated combinational agents. Among these papers, 162 dose-escalated at least two agents, and the remaining 381 dose-escalated only one agent while keeping the others fixed. Only one of the 543 papers used a design intended for combinational agents. On https://ClinicalTrials.gov/, we found 591 phase I/early phase I interventional studies of combinational agents in the U.S. with posted trial results and primary completion dates after 1/1/2010; however, the nine designs we evaluate here were cited in fewer than five trial papers. Although these 591 trials include some that have yet to be published, optimal designs for combination therapies appear to be underutilized. The discrepancy between the low uptake of novel designs in clinical practice and the effort devoted to developing better designs should be reconciled.
There are several barriers to implementing better designs in clinical trials of combinational agents. First, investigators have little practical guidance on design selection. Second, model-based designs are not easily understood and are relatively complicated to implement, as they usually require strong assumptions, parameter calibration, and ongoing statistical support to update the toxicity probability estimates. In addition, the start-up phases of different model-based designs can differ considerably, and some designs' start-up phases can substantially influence their operating characteristics. Such complexity adds another barrier to the broader use of model-based designs. Motivated by these hurdles, our simulation study aims to provide practical recommendations to investigators designing phase I clinical trials and to explore the impact of different design parameters when running model-based designs.
From our simulation results, we observed considerable performance fluctuations for several model-based designs across scenarios. Such instability may stem from their specific parametric dose-toxicity assumptions: when the assumed relationship fits the true scenario, these designs can perform favorably, and vice versa. Overall, POCRM, gCRM, cBOIN, and cKeyboard perform better than the other designs with respect to our evaluation metrics, and we recommend them for future combination dose-finding studies.
From a practical perspective, we would like to promote broader use of cBOIN and cKeyboard for combination trials, for several reasons. First, cBOIN and cKeyboard showed stable operating characteristics in all scenarios, which is crucial in practice where the true dose-toxicity profile is unknown. Second, they are convenient, requiring neither parameter calibration nor prior information about the agents. Finally, they are much easier to implement because the dose escalation/de-escalation rule can be tabulated before trial conduct (similar to the conventional 3+3 design). This feature is ideal for investigators who prefer the 3+3 design over model-based designs for its simplicity, even though the 3+3 design has lower accuracy in MTD identification and exposes more patients to subtherapeutic doses [25,29]. Our findings are consistent with a recently published study from the ASA biopharmaceutical working group [21], which evaluated the accuracy and safety of various phase I designs for combinational agents. For the setting of finding a single MTD, that study also concludes that combination BOIN is more attractive than algorithm-based and model-based designs when planning phase I clinical trials for combinational agents.
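To illustrate why such a table can be produced before the trial, the sketch below computes the BOIN escalation/de-escalation boundaries following Liu and Yuan [20], assuming the common defaults φ1 = 0.6φ and φ2 = 1.4φ, and prints the resulting decision table; cBOIN [18] applies the same boundaries within the dose-combination matrix, with an additional rule for choosing among admissible neighboring combinations. The target of 0.30 and the patient counts shown are hypothetical.

```python
# Sketch of the BOIN boundary calculation and the resulting pre-trial decision table.
import numpy as np

def boin_boundaries(phi, phi1=None, phi2=None):
    """Escalation (lambda_e) and de-escalation (lambda_d) boundaries of BOIN."""
    phi1 = 0.6 * phi if phi1 is None else phi1   # highest sub-therapeutic toxicity rate
    phi2 = 1.4 * phi if phi2 is None else phi2   # lowest over-toxic toxicity rate
    lam_e = np.log((1 - phi1) / (1 - phi)) / np.log(phi * (1 - phi1) / (phi1 * (1 - phi)))
    lam_d = np.log((1 - phi) / (1 - phi2)) / np.log(phi2 * (1 - phi) / (phi * (1 - phi2)))
    return lam_e, lam_d

phi = 0.30                                  # hypothetical target DLT probability
lam_e, lam_d = boin_boundaries(phi)         # approximately 0.236 and 0.358
print(f"escalate if observed DLT rate <= {lam_e:.3f}, de-escalate if >= {lam_d:.3f}")

# Decision table by number of patients treated at the current dose combination.
for n in range(3, 16, 3):
    escalate_if = int(np.floor(lam_e * n))      # largest DLT count still allowing escalation
    deescalate_if = int(np.ceil(lam_d * n))     # smallest DLT count forcing de-escalation
    print(f"n = {n:2d}: escalate if DLTs <= {escalate_if}, de-escalate if DLTs >= {deescalate_if}")
```

Because the thresholds depend only on the target toxicity probability and the number of patients treated at the current combination, the entire rule can be tabulated and handed to the clinical team before enrollment begins.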
In addition to evaluating the nine designs, we explored the impact of four design parameters that are commonly encountered in model-based designs and for which no general recommendations exist: monotherapy toxicity profiles, skeleton settings, dose escalation/de-escalation probability cutoffs, and prior guesses of the dose combinations' toxicity probabilities. After re-running the designs with alternative parameter settings, we found that the dose escalation/de-escalation probability cutoffs had negligible impact on operating characteristics in all scenarios, whereas all other parameters affected design performance. When monotherapy toxicity profiles are available, their impact on design performance is not a major concern. However, accurate guesses of the dose combinations' toxicity probabilities are often difficult to provide, and it is almost impossible to calibrate the skeleton setting through simulation in practice without knowing the true toxicity profile of the dose combinations.
Lastly, we repeated all simulations with a different maximum sample size and observed that almost all findings are consistent.
Our simulation study also has limitations. Some designs, such as gCRM, fix their cohort size at one, making the performance comparison with designs that allow flexible cohort sizes less fair.
Our hope is that this paper will contribute to the appropriate and responsible use of study designs for phase I trials of combinational agents. Heightened awareness of these new designs can only improve trial outcomes.
Funding Statement
Dr. Sayour is supported by an NCI award R37CA251978.
Disclosure statement
No conflict of interest was reported by the authors.
References
1. Babb J., Rogatko A., and Zacks S., Cancer phase I clinical trials: Efficient dose escalation with overdose control, Stat. Med. 17 (1998), pp. 1103–1120.
2. Braun T.M. and Jia N., A generalized continual reassessment method for two-agent phase I trials, Stat. Biopharm. Res. 5 (2013), pp. 105–115.
3. Braun T.M. and Wang S., A hierarchical Bayesian design for phase I trials of novel combinations of cancer therapeutic agents, Biometrics 66 (2010), pp. 805–812.
4. Bril G., Dykstra R., Pillers C., and Robertson T., Algorithm AS 206: Isotonic regression in two independent variables, J. R. Stat. Soc. Ser. C (Appl. Stat.) 33 (1984), pp. 352–357.
5. Conaway M.R., Dunbar S., and Peddada S.D., Designs for single- or multiple-agent phase I trials, Biometrics 60 (2004), pp. 661–669.
6. Diniz M.A., Kim S., and Tighiouart M., A Bayesian adaptive design in cancer phase I trials using dose combinations in the presence of a baseline covariate, J. Probab. Stat. 2018 (2018), pp. 1–11.
7. Ezzalfani M., How to design a dose-finding study on combined agents: Choice of design and development of R functions, PLoS ONE 14 (2019), Article ID e0224940.
8. Harrington J.A., Wheeler G.M., Sweeting M.J., Mander A.P., and Jodrell D.I., Adaptive designs for dual-agent phase I dose-escalation studies, Nat. Rev. Clin. Oncol. 10 (2013), p. 277.
9. Hirakawa A., Hamada C., and Matsui S., A dose-finding approach based on shrunken predictive probability for combinations of two agents in phase I trials, Stat. Med. 32 (2013), pp. 4515–4525.
10. Hirakawa A., Wages N.A., Sato H., and Matsui S., A comparative study of adaptive dose-finding designs for phase I oncology trials of combination therapies, Stat. Med. 34 (2015), pp. 3194–3213.
11. Ivanova A., Montazer-Haghighi A., Mohanty S.G., and Durham S.D., Improved up-and-down designs for phase I trials, Stat. Med. 22 (2003), pp. 69–82.
12. Ivanova A. and Kim S.H., Dose finding for continuous and ordinal outcomes with a monotone objective function: A unified approach, Biometrics 65 (2009), pp. 307–315.
13. Ivanova A. and Wang K., A non-parametric approach to the design and analysis of two-dimensional dose-finding trials, Stat. Med. 23 (2004), pp. 1861–1870.
14. Korn E.L. and Simon R., Using the tolerable-dose diagram in the design of phase I combination chemotherapy trials, J. Clin. Oncol. 11 (1993), pp. 794–801.
15. Lee S.M. and Cheung Y.K., Model calibration in the continual reassessment method, Clin. Trials 6 (2009), pp. 227–238.
16. Lee B.L. and Fan S.K., A two-dimensional search algorithm for dose-finding trials of two agents, J. Biopharm. Stat. 22 (2012), pp. 802–818.
17. Lin R. and Yin G., Bootstrap aggregating continual reassessment method for dose finding in drug-combination trials, Ann. Appl. Stat. 10 (2016), pp. 2349–2376.
18. Lin R. and Yin G., Bayesian optimal interval design for dose finding in drug-combination trials, Stat. Methods Med. Res. 26 (2017), pp. 2155–2167.
19. Liu S. and Ning J., A Bayesian dose-finding design for drug combination trials with delayed toxicities, Bayesian Anal. 8 (2013), p. 703.
20. Liu S. and Yuan Y., Bayesian optimal interval designs for phase I clinical trials, J. R. Stat. Soc. Ser. C (Appl. Stat.) 64 (2015), pp. 507–523.
21. Liu R., Yuan Y., Sen S., Yang X., Jiang Q., Li X., Lu C., Göneng M., Tian H., Zhou H., and Lin R., Accuracy and safety of novel designs for phase I drug-combination oncology trials, Stat. Biopharm. Res. (2022), pp. 1–19.
22. Love S.B., Brown S., Weir C.J., Harbron C., Yap C., Gaschler-Markefski B., Matcham J., Caffrey L., McKevitt C., Clive S., and Craddock C., Embracing model-based designs for dose-finding trials, Br. J. Cancer 117 (2017), pp. 332–339.
23. O'Quigley J., Pepe M., and Fisher L., Continual reassessment method: A practical design for phase 1 clinical trials in cancer, Biometrics 46 (1990), pp. 33–48.
24. Pan H., Lin R., Zhou Y., and Yuan Y., Keyboard design for phase I drug-combination trials, Contemp. Clin. Trials 92 (2020), p. 105972.
25. Reiner E., Paoletti X., and O'Quigley J., Operating characteristics of the standard phase I clinical trial design, Comput. Stat. Data Anal. 30 (1999), pp. 303–315.
26. Riviere M.K., Dubois F., and Zohar S., Competing designs for drug combination in phase I dose-finding clinical trials, Stat. Med. 34 (2015), pp. 1–12.
27. Riviere M.K., Le Tourneau C., Paoletti X., Dubois F., and Zohar S., Designs of drug-combination phase I trials in oncology: A systematic review of the literature, Ann. Oncol. 26 (2015), pp. 669–674.
28. Riviere M.K., Yuan Y., Dubois F., and Zohar S., A Bayesian dose-finding design for drug combination clinical trials based on the logistic model, Pharm. Stat. 13 (2014), pp. 247–257.
29. Simon R., Rubinstein L., Arbuck S.G., Christian M.C., Freidlin B., and Collins J., Accelerated titration designs for phase I clinical trials in oncology, J. Natl. Cancer Inst. 89 (1997), pp. 1138–1147.
30. Thall P.F., Millikan R.E., Mueller P., and Lee S.J., Dose-finding with two agents in phase I oncology trials, Biometrics 59 (2003), pp. 487–496.
31. Tighiouart M., Li Q., and Rogatko A., A Bayesian adaptive design for estimating the maximum tolerated dose curve using drug combinations in cancer phase I clinical trials, Stat. Med. 36 (2017), pp. 280–290.
32. Wages N.A. and Conaway M.R., Specifications of a continual reassessment method design for phase I trials of combined drugs, Pharm. Stat. 12 (2013), pp. 217–224.
33. Wages N.A., Conaway M.R., and O'Quigley J., Continual reassessment method for partial ordering, Biometrics 67 (2011), pp. 1555–1563.
34. Wages N.A., Conaway M.R., and O'Quigley J., Dose-finding design for multi-drug combinations, Clin. Trials 8 (2011), pp. 380–389.
35. Wages N.A. and Varhegyi N., pocrm: An R package for phase I trials of combinations of agents, Comput. Methods Programs Biomed. 112 (2013), pp. 211–218.
36. Wang K. and Ivanova A., Two-dimensional dose finding in discrete dose space, Biometrics 61 (2005), pp. 217–222.
37. Yan F., Mandrekar S.J., and Yuan Y., Keyboard: A novel Bayesian toxicity probability interval design for phase I clinical trials, Clin. Cancer Res. 23 (2017), pp. 3994–4003.
38. Yin G. and Yuan Y., A latent contingency table approach to dose finding for combinations of two agents, Biometrics 65 (2009), pp. 866–875.
39. Yin G. and Yuan Y., Bayesian dose finding in oncology for drug combinations by copula regression, J. R. Stat. Soc. Ser. C (Appl. Stat.) 58 (2009), pp. 211–224.
40. Yuan Y., Hess K.R., Hilsenbeck S.G., and Gilbert M.R., Bayesian optimal interval design: A simple and well-performing design for phase I oncology trials, Clin. Cancer Res. 22 (2016), pp. 4291–4301.