Abstract
In this article we address the problem of estimating minimum effective doses in dose-finding clinical trials of multidimensional treatment. We are motivated by a behavioral intervention trial where we introduce sedentary breaks to subjects with a goal to reduce their glucose level monitored over 8 hours. Each sedentary break regimen is defined by two elements: break frequency and break duration. The trial aims to identify minimum combinations of frequency and duration that shift mean glucose, that is, the minimum effective dose (MED) combinations. The means of glucose reduction associated with the dose combinations are only partially ordered. To circumvent constrained estimation due to partial ordering, we propose estimating the MED by maximizing a weighted product of combinationwise posterior gains. The estimation adopts an asymmetric gain function, indexed by a decision parameter $\varepsilon$, which defines the relative gains of a true negative decision and a true positive decision. We also introduce an adaptive $\varepsilon$-tapering algorithm to be used in conjunction with the estimation method. Simulation studies show that using asymmetric gain with a carefully chosen $\varepsilon$ is critical to keeping false discoveries low, while $\varepsilon$-tapering adds to the probability of identifying truly effective doses (i.e., true positives). Under an ensemble of scenarios for the sedentary break study, $\varepsilon$-tapering yields consistently high true positive rates across scenarios and achieves about 90% true positive rate, compared to 68% by a nonadaptive design with comparable false discovery rate.
Key words and phrases. Adaptive dose-finding, generalized PIPE, glucose monitoring, sedentary breaks, weighted posterior gain
1. Introduction.
In early phase clinical trials of a new treatment, dose-finding is a critical step to establish a promising dose for further clinical evaluation. Depending on the specific context, an early phase trial may aim to estimate the maximum tolerated dose that causes toxicity with a prespecified probability (Cheung (2007), O’Quigley, Pepe and Fisher (1990)) or the minimum effective dose that shifts the mean expression level of a biologic endpoint (Hsu and Berger (1999)). In practice, a treatment is often multidimensional such as when it consists of two or more agents or when it is defined by two or more regimen elements. Numerous dose-finding methods have been proposed for phase I trials of dual pharmacological agents, where the objective is to estimate the maximum tolerated dose with respect to a binary toxicity endpoint (Riviere, Dubois and Zohar (2015)). Briefly, Thall et al. (2003), Braun and Wang (2010), and Wheeler et al. (2017) consider Bayesian parametric dose-toxicity models to facilitate dose escalation and estimation. Ivanova and Wang (2006) consider isotonic estimation for dose-finding on a two-dimensional grid. Wages, Conaway and O’Quigley (2011) propose a partial ordering continual reassessment method which reduces the problem to a one-dimensional dose-finding problem by first estimating the order of the dose combinations; this method aims to identify one dose combination rather than multiple combinations.
In this article we consider a behavioral intervention trial at our institution, where we intervene on sedentary behaviors. Technological advancements in the past 50 years have led to an increasingly sedentary lifestyle in developed nations, with adults in the U.S. estimated to now spend, on average, 11–12 hours per day sedentary (Diaz et al. (2016)). While sedentary behavior is strongly associated with incidence of cardiovascular disease and all-cause mortality (Biswas et al. (2015), Ekelund et al. (2019)), research has further implicated sedentary time accrued in prolonged, uninterrupted bouts, such as sitting for hours at a time, as potentially the most hazardous form of sedentary behavior, suggesting that regularly breaking up sedentary time with bouts of activity may be an important adjunct to existing physical activity guidelines (Diaz et al. (2017)). However, only general recommendations to “sit less, move more” have been proposed without specific, actionable targets (Young et al. (2016)). The lack of specific recommendations is attributed to a dearth of empirical data to inform more quantitative guidelines with respect to dose. This knowledge gap has motivated our trial wherein we study the effects of regularly breaking up sedentary time (henceforth referred to as sedentary breaks) on glucose levels, measured as areas under the curve (Allison et al. (1995)), over an eight-hour time period when compared to a control period. As each sedentary break regimen is defined by two elements, namely, frequency of breaks and duration of each break, the aims of the sedentary break study are to determine how frequently periods of prolonged sedentary time should be interrupted (e.g., every 30 minutes, every 60 minutes) and for how long (e.g., one minute, five minutes). While we expect a monotone increasing relationship between break frequency/duration (“dose”) and glucose reduction, the effect will plateau after an adequately high dose. To enhance the feasibility of sedentary breaks, we aim to identify the minimum levels of break frequency and duration that shift the mean glucose reduction. In addition, since the two study aims (minimum frequency and minimum duration) are interdependent, in that breaks at a higher frequency may be coupled with a shorter duration and vice versa, it is likely that more than one minimum combination of frequency and duration will achieve the target reduction. For example, under Scenario 1 in Figure 1, a break duration of seven minutes (duration level 4) or more is effective provided that a break is introduced every 90 minutes or more frequently (frequency level 2 or above). In contrast, Scenario 2 represents a situation where the effective duration depends on the break frequency in a more nuanced manner. This problem is thus formulated as estimating the minimum effective dose (MED) combinations on a multidimensional grid, which we will formally define in Section 2.
Fig. 1.

Two examples of dose-outcome scenarios for the sedentary break study with five frequency levels and five duration levels. A “+” indicates a combination is effective, and “−” ineffective. The minimum effective dose(s) are indicated as circles, and effective combinations on the boundary are marked by dotted lines.
The above mentioned dose-finding methods for a binary toxicity endpoint, among many others, use estimation procedures that respect the monotone dose-outcome constraint and, in principle, can be modified for MED estimation in our trial where the outcome is a continuous efficacy endpoint. On the other hand, whether these methods are directly applicable in our context requires further examination for several reasons. First, the statistical performance of model-based methods may depend on correct or near-correct model assumptions. Robust parametric estimation can be difficult when the dose-outcome relationship is flat or plateaus (Cheung (2007)) or when many dose combinations are tested. Second, as these methods use constrained estimation with respect to the partial dose orders, computations on a large or multidimensional grid may prove challenging, if not intractable. Third, the dose-finding literature mainly focuses on drug trials where ethical considerations necessitate the use of adaptive dosing algorithms that assign doses away from very low or very high doses. The short-term intervention and the reversible nature of the endpoint in the sedentary break study afford relatively relaxed restrictions as to how study subjects are dosed. Therefore, it is worth re-examining the advantages of adaptive algorithms beyond ethical considerations.
Mander and Sweeting (2015) propose a two-dimensional dose estimation method, which circumvents the modeling of partial dose orders and estimates the toxicity probability of each dose combination based only on the outcomes observed at that combination, using a beta prior and the binomial likelihood. Because of independence among dose combinations, the joint posterior distribution is a product of independent combinationwise beta distributions. Thus, computation of any posterior quantities is fast. The authors also use an adaptive algorithm that updates the joint posterior distribution continuously during a trial as the basis for dose escalations. The method is thus called product of independent beta probabilities escalation (PIPE).
PIPE in its original form implicitly assigns equal gain value to a true positive decision and a true negative decision, while convention in clinical research appropriately places greater emphasis on the latter. In addition, Mander and Sweeting (2015) is motivated by applications with binary outcomes, whereas our present application deals with continuous outcomes. To extend the work in Mander and Sweeting (2015), we describe in Section 2 a generalized PIPE estimator using continuous outcomes with respect to a class of asymmetric gain functions. For brevity, the exposition focuses on two-dimensional combinations in the context of the sedentary break study, although we also outline generalization to situations with treatment defined by more than two elements. In Section 3, we discuss adaptive dose-escalation algorithms based on the proposed generalized PIPE estimator, introduce an $\varepsilon$-tapering approach to improve trial performance, and illustrate the methods with numerical examples. The methods described in Sections 2 and 3, when put together, provide a novel general framework for MED estimation. In Section 4 we apply the framework to design the sedentary break study and describe an ensemble calibration process to specify the gain function appropriate for the application. The method’s performance is evaluated and compared using simulation in Section 5. We end this article with a discussion in Section 6.
2. Minimum effective dose estimation.
2.1. Notations and problem formulation.
In this section we introduce the notations and the problem in the context of the sedentary break study, where we consider two elements of the intervention, namely, frequency and duration. Let $J$ and $K$ denote the respective numbers of frequency and duration levels, and let $\mathcal{A}$ be the set of all combinations considered in the study. In our study we consider all possible combinations on the two-dimensional grid, that is, $\mathcal{A} = \{(j,k): j = 1, \ldots, J;\ k = 1, \ldots, K\}$, although the proposed methods can be applied to any arbitrary $\mathcal{A}$. Let $Y_{ijk}$ denote the glucose reduction of the $i$th subject in combination $(j,k)$, which is normally distributed with mean $\mu_{jk}$ and variance $\sigma^2$. We assume monotonicity
$$\mu_{jk} \le \mu_{j'k'} \quad \text{whenever } j \le j' \text{ and } k \le k', \tag{1}$$
so that the combination means are partially ordered.
Let $\psi = \{\psi_{jk}: (j,k) \in \mathcal{A}\}$ denote an effective dose configuration, where $\psi_{jk} = 1\{\mu_{jk} > \eta\}$ is the indicator that combination $(j,k)$ is effective, that is, that its mean glucose reduction exceeds a prespecified clinically meaningful level $\eta$. While there are $2^{JK}$ possible $\psi$'s on a full two-dimensional grid with no constraint, the set $\Psi$ of configurations consistent with (1) consists of only $\binom{J+K}{J}$ configurations. Now, for any $\psi \in \Psi$, we define the set of minimum effective doses
$$\mathcal{M}(\psi) = \big\{(j,k): \psi_{jk} = 1 \text{ and } \psi_{j-1,k} = \psi_{j,k-1} = 0\big\},$$
with the convention $\psi_{0,k} = \psi_{j,0} = 0$. It can be verified that the minimum effective doses uniquely define the effective dose configuration under monotonicity (1).
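Under (1) a configuration in $\Psi$ is a “staircase” set, so $\Psi$ can be enumerated directly and $\mathcal{M}(\psi)$ read off from it. The following sketch (Python; illustrative helper names, not from the study software) enumerates $\Psi$ on a $J \times K$ grid and extracts the minimum effective doses, reproducing the count of $\binom{10}{5} = 252$ configurations for the $5 \times 5$ grid used later.

```python
from itertools import combinations_with_replacement
from math import comb

import numpy as np


def monotone_configs(J=5, K=5):
    """Enumerate all configurations psi (J x K 0/1 arrays) consistent with the
    monotonicity constraint (1): psi is non-decreasing in both coordinates.

    Each such configuration is a "staircase" determined by thresholds
    c_1 >= ... >= c_J, where row j is effective at duration levels k >= c_j
    (c_j = K + 1 encodes a row with no effective dose)."""
    configs = []
    for c in combinations_with_replacement(range(1, K + 2), J):
        thresh = c[::-1]                      # non-increasing thresholds
        psi = np.zeros((J, K), dtype=int)
        for j in range(J):
            psi[j, thresh[j] - 1:] = 1        # effective for k >= c_j
        configs.append(psi)
    return configs


def med_set(psi):
    """Minimum effective doses M(psi): effective (j, k) whose lower neighbours
    (j-1, k) and (j, k-1) are both ineffective (out-of-grid treated as 0)."""
    J, K = psi.shape
    med = []
    for j in range(J):
        for k in range(K):
            if psi[j, k] == 1 \
                    and (j == 0 or psi[j - 1, k] == 0) \
                    and (k == 0 or psi[j, k - 1] == 0):
                med.append((j + 1, k + 1))    # report 1-based dose levels
    return med


configs = monotone_configs(5, 5)
assert len(configs) == comb(10, 5) == 252     # matches binom(J+K, J) for the 5 x 5 grid
# A Scenario-1-like configuration: effective iff frequency >= 2 and duration >= 4
psi = np.array([[int(j >= 2 and k >= 4) for k in range(1, 6)] for j in range(1, 6)])
print(med_set(psi))                           # -> [(2, 4)]
```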
2.2. Maximizing posterior gain with an asymmetric function.
Consider, for the moment, estimation of $\psi_{jk}$ using the data $D_{jk}$ observed for a given combination $(j,k)$. For any estimator $\hat\psi_{jk}$, we adopt the following class of gain functions:
$$G_\varepsilon(\psi_{jk}, \hat\psi_{jk}) = (1 - \varepsilon)\, 1\{\hat\psi_{jk} = \psi_{jk} = 0\} + \varepsilon\, 1\{\hat\psi_{jk} = \psi_{jk} = 1\} \tag{2}$$
for some $\varepsilon \in (0, 1/2]$. Setting $\varepsilon = 1/2$ implies equal gain for the two correct decision types, whereas setting $\varepsilon < 1/2$ assigns a greater gain to a true negative (i.e., $\hat\psi_{jk} = \psi_{jk} = 0$) than a true positive (i.e., $\hat\psi_{jk} = \psi_{jk} = 1$). The gain of a false decision (i.e., $\hat\psi_{jk} \ne \psi_{jk}$) is 0.
Applying standard computations, we can derive the Bayes estimator with respect to gain function (2),
$$\hat\psi_{jk}(\varepsilon) = 1\{\pi_{jk} \ge 1 - \varepsilon\}, \tag{3}$$
where $\pi_{jk} = E\{1(\psi_{jk} = 1) \mid D_{jk}\} = \Pr(\mu_{jk} > \eta \mid D_{jk})$ and $E$ denotes expectation taken with respect to the posterior distribution of $\mu_{jk}$ given $D_{jk}$.
To estimate the configuration $\psi$ jointly, we propose extending (3) by maximizing a weighted product of posterior gains over all combinations,
$$\hat\psi_\varepsilon = \arg\max_{\psi \in \Psi} \prod_{(j,k) \in \mathcal{A}} \Big[ \{\varepsilon\, \pi_{jk}\}^{\psi_{jk}} \{(1 - \varepsilon)(1 - \pi_{jk})\}^{1 - \psi_{jk}} \Big]^{w_{jk}}, \tag{4}$$
where $w_{jk}$ is a weight associated with combination $(j,k)$. The maximizer (4) yields a generalized PIPE estimator for $\psi$: in the case of binary data, when $\varepsilon = 1/2$ and $w_{jk} \equiv 1$, the estimator reduces to the original PIPE estimator in Mander and Sweeting (2015). As will be illustrated, fixing $\varepsilon = 1/2$ (i.e., the original PIPE estimator) will lead to false positive discoveries at a rate higher than conventionally expected in intervention trials. Intuitively, the weight $w_{jk}$ may be set to reflect the information content available for dose $(j,k)$ so that the estimation places greater weights on doses with greater numbers of subjects. In this article we will use $w_{jk} = n_{jk}$, where $n_{jk}$ is the sample size of dose $(j,k)$. Note that we can use unconstrained $\pi_{jk}$'s in (4), and the partial ordering of $\hat\psi_\varepsilon$ will be preserved through maximization over the set $\Psi$. It thus circumvents the need for constrained estimation of the $\mu_{jk}$'s, which can be computationally intensive, especially when the grid is large.
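For a given $\varepsilon$, posterior probabilities $\pi_{jk}$, and weights $w_{jk}$, the maximizer in (4) can be found by brute force over $\Psi$, conveniently on the log scale. A minimal sketch (Python; function and variable names are ours, not the authors'):

```python
import numpy as np


def generalized_pipe(pi, w, eps, configs):
    """Generalized PIPE estimate: the configuration psi in Psi maximizing the
    weighted product of combinationwise posterior gains in (4), computed on
    the log scale for numerical stability.

    pi      : J x K array of posterior probabilities P(psi_jk = 1 | data)
    w       : J x K array of weights (e.g., per-dose sample sizes n_jk)
    eps     : decision parameter in (0, 1/2]
    configs : iterable of J x K 0/1 arrays satisfying monotonicity (1)
    """
    pi = np.clip(pi, 1e-12, 1.0 - 1e-12)          # guard the logarithms
    log_gain1 = np.log(eps * pi)                  # log gain when psi_jk = 1
    log_gain0 = np.log((1.0 - eps) * (1.0 - pi))  # log gain when psi_jk = 0
    best_psi, best_val = None, -np.inf
    for psi in configs:
        val = np.sum(w * (psi * log_gain1 + (1 - psi) * log_gain0))
        if val > best_val:
            best_psi, best_val = psi, val
    return best_psi
```

Here `configs` can be generated with the enumeration sketched in Section 2.1; with $J = K = 5$ the search is over only 252 candidate configurations.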
To make the evaluation concrete, consider the semiconjugate prior: $\mu_{jk} \sim N(m_0, v_0)$ and $\sigma^2 \sim p(\sigma^2)$ a priori independently for each $(j,k)$. As a result, the posterior distribution of $\mu_{jk}$ can be simulated from $N\{\tilde m_{jk}(\sigma^2), \tilde v_{jk}(\sigma^2)\}$, where
$$\tilde v_{jk}(\sigma^2) = \left(\frac{1}{v_0} + \frac{n_{jk}}{\sigma^2}\right)^{-1}, \qquad \tilde m_{jk}(\sigma^2) = \tilde v_{jk}(\sigma^2)\left(\frac{m_0}{v_0} + \frac{n_{jk}\, \bar y_{jk}}{\sigma^2}\right),$$
and $\bar y_{jk}$ is the sample mean of the observations at the dose, after drawing $\sigma^2$'s from the marginal posterior $p(\sigma^2 \mid D_{jk}) \propto p(\sigma^2) \int \prod_{i=1}^{n_{jk}} \sigma^{-1}\phi\{(y_{ijk} - \mu)/\sigma\}\, dN(\mu; m_0, v_0)$. The function $\phi$ denotes the standard normal density function, and the product inside the integral is the normal likelihood given $\sigma^2$. In effect, these simulation steps give an approximation to the single integral
$$\pi_{jk} = \int \Phi\!\left(\frac{\tilde m_{jk}(\sigma^2) - \eta}{\sqrt{\tilde v_{jk}(\sigma^2)}}\right) p(\sigma^2 \mid D_{jk})\, d\sigma^2, \tag{5}$$
where $\Phi$ is the standard normal distribution function. Then, $\hat\psi_\varepsilon$ is solved by plugging (5) into (4), and the minimum effective dose combinations are estimated by $\mathcal{M}(\hat\psi_\varepsilon)$.
We note that the proposed estimation method involves the computation of $JK$ single integrals of the form (5) and iterating over $\binom{J+K}{J}$ products in (4). These are simple operations, each of which can be efficiently computed.
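As one concrete way to carry out the simulation steps above, the sketch below approximates $\pi_{jk}$ for a single dose combination by a Gibbs sampler with Rao–Blackwellization, assuming for illustration an inverse-gamma prior on $\sigma^2$; the threshold $\eta$ and all hyperparameters are placeholders rather than the study's values.

```python
import numpy as np
from scipy import stats


def posterior_prob_effective(y, eta=0.0, m0=0.0, v0=1.0, a0=2.0, b0=1.0,
                             n_iter=4000, burn=1000, seed=1):
    """Approximate pi_jk = P(mu_jk > eta | D_jk) in (5) for one dose combination.

    Semiconjugate model: mu ~ N(m0, v0) and sigma^2 ~ Inv-Gamma(a0, b0)
    (the inverse-gamma choice and every hyperparameter here are illustrative).
    A Gibbs sampler draws sigma^2; for each draw the conditional posterior of
    mu is N(m_tilde, v_tilde), so the normal tail probability is averaged
    (Rao-Blackwellization), approximating the single integral (5).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n, ybar = len(y), y.mean()
    sig2 = y.var() + 1e-6  # crude initial value for sigma^2
    probs = []
    for it in range(n_iter):
        # mu | sigma^2, y ~ N(m_tilde, v_tilde)
        v_tilde = 1.0 / (1.0 / v0 + n / sig2)
        m_tilde = v_tilde * (m0 / v0 + n * ybar / sig2)
        mu = rng.normal(m_tilde, np.sqrt(v_tilde))
        # sigma^2 | mu, y ~ Inv-Gamma(a0 + n/2, b0 + sum((y - mu)^2)/2)
        sig2 = stats.invgamma.rvs(a0 + 0.5 * n,
                                  scale=b0 + 0.5 * np.sum((y - mu) ** 2),
                                  random_state=rng)
        if it >= burn:
            probs.append(1.0 - stats.norm.cdf(eta, loc=m_tilde,
                                              scale=np.sqrt(v_tilde)))
    return float(np.mean(probs))


# Example: five (hypothetical) glucose reductions observed at one combination
print(posterior_prob_effective([0.28, 0.13, -0.17, 0.24, 0.10]))
```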
Generalization of (4) to situations with multidimensional treatments is straightforward. For an $L$-dimensional treatment, let $(j_1, \ldots, j_L)$ index the combination with the $l$th element administered at its $j_l$th level. Then, the objective function becomes
$$\hat\psi_\varepsilon = \arg\max_{\psi \in \Psi} \prod_{(j_1, \ldots, j_L) \in \mathcal{A}} \Big[\{\varepsilon\, \pi_{j_1 \cdots j_L}\}^{\psi_{j_1 \cdots j_L}} \{(1 - \varepsilon)(1 - \pi_{j_1 \cdots j_L})\}^{1 - \psi_{j_1 \cdots j_L}}\Big]^{w_{j_1 \cdots j_L}}.$$
While the number of computations increases exponentially with the dimension $L$ when a full grid is considered, the actual computational intensity depends only on the number of dose combinations considered in a study. In addition, because the $\pi_{j_1 \cdots j_L}$'s are calculated independently, the estimation method involves the same basic computational operations regardless of $L$ and is computationally scalable.
3. Adaptive dose escalation algorithms.
3.1. Dose-finding with asymmetric gain.
In a nonadaptive PIPE design, subjects are randomized evenly to each combination, and the MED is estimated by evaluating $\mathcal{M}(\hat\psi_\varepsilon)$ at the end. Even when the grid size is only moderately large, the number of subjects per dose can be small, unless the sample size is large, thus resulting in low information content at all doses. For example, the sedentary break study tests $J = 5$ frequency levels and $K = 5$ duration levels, that is, a total of 25 combinations. With an anticipated total sample size of $N = 125$, a balanced study design will enroll $n_{jk} = 5$ subjects per dose. As an alternative, we consider sequentially assigning dose combinations, using data accrued in the study, to guide dosing in the proximity of the true MED combinations. Specifically, dose escalation during the trial can be conducted per the following.
Algorithm 1.
Adaptive PIPE with asymmetric gain:
0. Set the dose for the first subject at the highest level $(J, K)$.
1. Enroll and treat one subject at a time.
2. Evaluate the PIPE estimator $\hat\psi_\varepsilon$ using a prespecified $\varepsilon$ based on the most updated data.
3. Define the sampling grid $\mathcal{S}(\hat\psi_\varepsilon)$ that consists of all effective doses on the boundary, according to $\hat\psi_\varepsilon$, and their ineffective neighboring doses. Precisely, the boundary effective doses are identified as $\mathcal{B}(\hat\psi_\varepsilon) = \{(j,k): \hat\psi_{jk} = 1 \text{ and } \hat\psi_{j-1,k}\,\hat\psi_{j,k-1} = 0\}$, and the ineffective neighbors as $\mathcal{N}(\hat\psi_\varepsilon) = \{(j,k): \hat\psi_{jk} = 0 \text{ and } \hat\psi_{j+1,k} + \hat\psi_{j,k+1} \ge 1\}$, with the convention $\hat\psi_{jk} = 0$ outside the grid. In the case of $\hat\psi_\varepsilon \equiv 0$, the sampling grid consists of the singleton $\{(J, K)\}$.
4. Go to Step 1 with the dose randomly drawn from $\mathcal{S}(\hat\psi_\varepsilon) = \mathcal{B}(\hat\psi_\varepsilon) \cup \mathcal{N}(\hat\psi_\varepsilon)$. We will consider two variations of Algorithm 1 with different randomization schemes:
(a) Select all doses in $\mathcal{S}(\hat\psi_\varepsilon)$ with equal probability (Algorithm 1-EQ), or,
(b) Select a dose with probability proportional to $\pi_{jk}$ (Algorithm 1-PP).
Algorithm 1 aims to explore the effective and ineffective doses near the boundary, so as to allocate resources away from doses the data identify as effective or ineffective with a relatively high level of confidence. Mander and Sweeting (2015) provide technical rationales for sampling doses below the estimated boundary contour, thus motivating the sampling of doses among $\mathcal{S}(\hat\psi_\varepsilon)$ without regard to their effectiveness per Step 4a. Alternatively, one could favor doses that are empirically superior in terms of $\pi_{jk}$ per Step 4b. These two variations of Algorithm 1 will be referred to as Algorithm 1-EQ and Algorithm 1-PP, respectively. We also note that the starting dose in Step 0 may be replaced with a dose drawn from the sampling grid defined by $\hat\psi_\varepsilon$ evaluated under the prior distribution; this prior-based choice, however, will coincide with the highest level $(J, K)$ when an asymmetric gain with $\varepsilon < 1/2$ is used.
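The sampling grid in Step 3 and the two randomization schemes in Step 4 can be coded directly from $\hat\psi_\varepsilon$. A sketch of our reading of these steps (Python; doses indexed as 1-based $(j,k)$ pairs; helper names are ours):

```python
import numpy as np


def sampling_grid(psi_hat):
    """Step 3: sampling grid S = B(psi_hat) union N(psi_hat), i.e., the effective
    doses on the boundary of the estimated effective region together with the
    ineffective doses adjacent to it one level below in either coordinate.
    If no dose is estimated effective, S is the singleton {(J, K)}."""
    J, K = psi_hat.shape
    if psi_hat.sum() == 0:
        return [(J, K)]
    grid = []
    for j in range(J):
        for k in range(K):
            lo_j = psi_hat[j - 1, k] if j > 0 else 0
            lo_k = psi_hat[j, k - 1] if k > 0 else 0
            up_j = psi_hat[j + 1, k] if j < J - 1 else 0
            up_k = psi_hat[j, k + 1] if k < K - 1 else 0
            if psi_hat[j, k] == 1 and (lo_j == 0 or lo_k == 0):
                grid.append((j + 1, k + 1))   # boundary effective dose
            elif psi_hat[j, k] == 0 and (up_j == 1 or up_k == 1):
                grid.append((j + 1, k + 1))   # ineffective neighbour
    return grid


def draw_next_dose(grid, pi, rng, scheme="EQ"):
    """Step 4: draw the next dose from S with equal probabilities (EQ) or with
    probabilities proportional to pi_jk (PP)."""
    if scheme == "EQ":
        probs = np.ones(len(grid))
    else:
        probs = np.array([pi[j - 1, k - 1] for (j, k) in grid], dtype=float)
    probs = probs / probs.sum()
    return grid[rng.choice(len(grid), p=probs)]


# Example: only (5, 5) estimated effective -> grid [(4, 5), (5, 4), (5, 5)]
psi_hat = np.zeros((5, 5), dtype=int)
psi_hat[4, 4] = 1
print(sampling_grid(psi_hat))
```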
3.2. PIPE with $\varepsilon$-tapering.
From a practical viewpoint, Algorithm 1 with a “small” $\varepsilon$ may treat many subjects at the higher doses before it explores the lower doses, whereas a “large” $\varepsilon$ may lead to a high false discovery rate in the estimation. To improve exploration during the trial without inflating false discoveries at the end, we propose an $\varepsilon$-tapering algorithm, whereby a decreasing sequence $\varepsilon_1 \ge \varepsilon_2 \ge \cdots \ge \varepsilon_N$ is used to evaluate $\hat\psi_{\varepsilon_n}$ as the trial progresses.
Algorithm 2.
Adaptive PIPE with $\varepsilon$-tapering:
0. Set the dose for the first subject at the highest level $(J, K)$.
1. Enroll and treat one subject at a time.
2. Evaluate $\hat\psi_{\varepsilon_n}$ using $\varepsilon_n$ based on the data from the current $n$ subjects.
3. Define the sampling grid $\mathcal{S}(\hat\psi_{\varepsilon_n})$ in the same manner as in Algorithm 1.
4. Go to Step 1 with the dose randomly drawn from $\mathcal{S}(\hat\psi_{\varepsilon_n})$, with two variations:
(a) Select all doses in $\mathcal{S}(\hat\psi_{\varepsilon_n})$ with equal probability (Algorithm 2-EQ), or,
(b) Select a dose with probability proportional to $\pi_{jk}$ (Algorithm 2-PP).
The two variations of Algorithm 2, per the randomization schemes in Step 4a and Step 4b, will be referred to as Algorithm 2-EQ and Algorithm 2-PP, respectively. For $\varepsilon$-tapering in Step 2, we consider a linear sequence, that is,
$$\varepsilon_n = \varepsilon_1 - \frac{n - 1}{N - 1}(\varepsilon_1 - \varepsilon_N), \quad n = 1, \ldots, N, \tag{6}$$
according to which the final $\hat\psi$ is estimated with $\varepsilon_N$. While the sequence can be used with any prespecified $\varepsilon_1$, we expect $\varepsilon_1 > \varepsilon_N$ in most applications.
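For illustration, the linear sequence (6) with $N = 125$ and $\varepsilon_N = 0.0425$ (the calibrated value for Algorithm 2-EQ in Table 2), together with an assumed starting value $\varepsilon_1 = 1/2$, tapers the decision threshold $1 - \varepsilon_n$ in (3) from about 0.5 to 0.9575:

```python
import numpy as np

# Linear epsilon-tapering sequence (6); eps_1 = 0.5 is an assumed starting value,
# while eps_N = 0.0425 and N = 125 follow the calibrated Algorithm 2-EQ design.
N, eps_1, eps_N = 125, 0.5, 0.0425
n = np.arange(1, N + 1)
eps = eps_1 - (n - 1) * (eps_1 - eps_N) / (N - 1)
print(np.round(eps[:4], 3))        # near-symmetric gains early on encourage exploration
print(np.round(1 - eps[-4:], 3))   # decision thresholds 1 - eps_n approach 0.9575
```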
Note that evaluating the PIPE estimator (4) with independent $\pi_{jk}$'s reduces the computational intensity of the adaptive algorithms: Step 2 after each new subject involves computing one single integral (5), namely at the dose just administered, instead of $JK$ single integrals or a multidimensional integral. This saving is substantial and is critical for design calibration, which requires simulating a large number of trials under an ensemble of scenarios; see Section 4.
3.3. Numerical illustration.
To illustrate how the adaptive PIPE operates, Table 1 gives the doses and outcomes of the first 12 subjects in two simulated trials, using Algorithms 1-EQ and 2-EQ, and gives the posterior updates as the trials progress; see Table 2 in Section 4.2 for details of these algorithms. Both algorithms use the semiconjugate prior specified in Section 2.2, with the same prior hyperparameters and the same prior estimate of $\sigma^2$.
Table 1.
Illustration of two adaptive PIPE designs: (A) Algorithm 2-EQ adopts $\varepsilon$-tapering with the linear sequence (6) and $\varepsilon_N = 0.0425$. (B) Algorithm 1-EQ uses a fixed $\varepsilon = 0.05$; see Section 4.2 for the choices of the respective values. Each row shows the dose and outcome of a subject and the corresponding updates of $\pi$ at the assigned dose and $\mathcal{M}(\hat\psi)$; panel (A) also reports the tapered $\varepsilon_n$
| $i$ | Dose (A) | $y_i$ (A) | $\pi$ (A) | $\mathcal{M}(\hat\psi)$ (A) | $\varepsilon_n$ (A) | Dose (B) | $y_i$ (B) | $\pi$ (B) | $\mathcal{M}(\hat\psi)$ (B) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | (5,5) | 0.28 | 0.94 | {(5,5)} | 0.493 | (5,5) | 0.28 | 0.94 | ∅ |
| 2 | (5,4) | 0.13 | 0.92 | {(5,4)} | 0.489 | (5,5) | 0.13 | 0.95 | {(5,5)} |
| 3 | (4,5) | −0.17 | 0.078 | {(5,4)} | 0.486 | (4,5) | 0.23 | 0.94 | {(5,5)} |
| 4 | (4,5) | −0.00 | 0.19 | {(5,4)} | 0.482 | (5,4) | 0.25 | 0.94 | {(5,5)} |
| 5 | (4,5) | 0.24 | 0.57 | {(4,5),(5,4)} | 0.478 | (5,5) | 0.29 | 0.99 | {(5,5)} |
| 6 | (5,3) | 0.10 | 0.91 | {(4,5),(5,3)} | 0.475 | (5,4) | −0.03 | 0.76 | {(5,5)} |
| 7 | (5,2) | −0.06 | 0.11 | {(4,5),(5,3)} | 0.471 | (5,5) | 0.20 | 0.99 | {(5,5)} |
| 8 | (4,3) | 0.11 | 0.91 | {(4,3)} | 0.467 | (5,4) | −0.09 | 0.66 | {(5,5)} |
| 9 | (3,4) | 0.15 | 0.93 | {(3,4),(4,3)} | 0.464 | (5,5) | 0.37 | 0.99 | {(5,5)} |
| 10 | (4,3) | −0.31 | 0.33 | {(3,4),(5,3)} | 0.460 | (5,4) | 0.26 | 0.82 | {(5,5)} |
| 11 | (2,5) | 0.15 | 0.93 | {(2,5),(3,4),(5,3)} | 0.456 | (5,5) | 0.16 | 0.99 | {(5,5)} |
| 12 | (4,3) | 0.05 | 0.37 | {(2,5),(3,4),(5,3)} | 0.453 | (4,5) | 0.11 | 0.96 | {(4,5)} |
Table 2.
Calibration results of five PIPE designs for the sedentary break study with $N = 125$. The calibrated designs are those with the largest $\varepsilon$ (the final $\varepsilon_N$ for Algorithm 2) that meet the criterion FDR $\le 0.05$
| Method | Description | Randomization (Step 4) | $\varepsilon$ | FDR | TPR | Average no. of subjects at an effective dose |
|---|---|---|---|---|---|---|
| Non-adaptive | 5 subjects per combination | – | 0.0525 | 0.046 | 0.68 | 63 |
| Algorithm 1-EQ | Adaptive PIPE, fixed $\varepsilon$ | equal among $\mathcal{S}$ | 0.0500 | 0.048 | 0.79 | 103 |
| Algorithm 1-PP | Adaptive PIPE, fixed $\varepsilon$ | proportional to $\pi_{jk}$ | 0.0625 | 0.048 | 0.78 | 111 |
| Algorithm 2-EQ | Adaptive PIPE, $\varepsilon$-tapering | equal among $\mathcal{S}$ | 0.0425 | 0.046 | 0.88 | 66 |
| Algorithm 2-PP | Adaptive PIPE, $\varepsilon$-tapering | proportional to $\pi_{jk}$ | 0.0525 | 0.049 | 0.91 | 91 |
The first subject was treated at the highest dose (5, 5) and had an outcome of 0.28. As a result, Algorithm 2 updated $\hat\psi$, that is, $\hat\psi_{55} = 1$ and $\hat\psi_{jk} = 0$ for all $(j,k) \ne (5,5)$, as shown in Table 1A. This in turn defined $\mathcal{B}(\hat\psi) = \{(5,5)\}$ and $\mathcal{N}(\hat\psi) = \{(4,5), (5,4)\}$, from the union of which the dose for the second subject would be drawn. The second subject then was assigned (5, 4) and had an outcome of 0.13, thus moving toward the lower ends of the dose range. While a gradually decreasing sequence $\{\varepsilon_n\}$ was used as the trial proceeded per Algorithm 2, the initial values of $\varepsilon_n$ were relatively large and allowed exploration of the lower dose range after relatively few subjects. In contrast, as shown in Table 1B, Algorithm 1 took 12 subjects before it moved the MED estimate to combination (4, 5), illustrating a much smaller exploration range than Algorithm 2.
4. Design calibration with ensemble simulation.
4.1. Application: Sedentary break study.
In this section we consider applying adaptive PIPE (Algorithms 1-EQ, 1-PP, 2-EQ, and 2-PP) to the sedentary break study and compare them with the nonadaptive PIPE where each combination has an equal number of subjects. With a given prior distribution and total sample size, a PIPE design (adaptive or nonadaptive) is completely specified by the decision parameter $\varepsilon$ in the gain function (2), or by the final value $\varepsilon_N$ in the case of $\varepsilon$-tapering. We take a design calibration approach that chooses $\varepsilon$ to keep false positive discoveries low when averaged across an ensemble of scenarios. Specifically, with $J = K = 5$, there are a total of 252 effective dose configurations $\psi$ in $\Psi$. For each $\psi \in \Psi$, we define a plateau dose-outcome scenario with mean $\mu_{jk} = \mu_1$ for a truly effective dose (i.e., $\psi_{jk} = 1$) and $\mu_{jk} = \mu_0$ for an ineffective dose (i.e., $\psi_{jk} = 0$). We set the true $\sigma^2$ in all scenarios based on our pilot data. For each design and each value of $\varepsilon$, we simulate 10,080 trials under the ensemble, that is, 40 replicates under each of the 252 scenarios. The sample size of each simulated trial was $N = 125$.
To define the metric for false positive discoveries, let $R$ denote the number of estimated minimum effective dose combinations at the end of a simulated trial. Then, the false MED discovery proportion in a simulated trial is evaluated as
$$\mathrm{FDP} = \frac{\#\{(j,k) \in \mathcal{M}(\hat\psi_\varepsilon): \psi_{jk} = 0\}}{R}, \tag{7}$$
and the proportion is defined as 0 if $R = 0$ so that (7) is always well defined. The overall false discovery rate (FDR) is calculated by averaging (7) over the 10,080 simulated trials. As a larger value of $\varepsilon$ corresponds to assigning less gain to a true negative decision, it intuitively results in discovering more doses as positive. Thus, the design calibration process aims to identify the largest $\varepsilon$ with FDR $\le 0.05$.
In addition to tracking false positive discoveries, we evaluate the true positives associated with each PIPE design and each given $\varepsilon$. Specifically, we evaluate the proportion of truly effective doses identified as effective in a simulated trial,
$$\mathrm{TPP} = \frac{\#\{(j,k): \hat\psi_{jk} = 1 \text{ and } \psi_{jk} = 1\}}{\#\{(j,k): \psi_{jk} = 1\}}. \tag{8}$$
The overall true positive rate (TPR) is calculated by averaging (8) over the 10,080 simulated trials. In our application the sample size is determined based on enrollment feasibility. Thus, different PIPE designs, with the same sample size and comparable FDR, can be compared on the basis of TPR. As an ethical metric, we also record the average number of subjects treated at an effective dose.
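The per-trial summaries (7) and (8) are straightforward to compute from the true configuration $\psi$ and the final estimate; FDR and TPR are then averages over the simulated trials. A small sketch (Python; helper names are ours, and `simulated_trials` in the trailing comment is a hypothetical list of per-trial results):

```python
import numpy as np


def trial_metrics(psi_true, psi_hat, med_hat):
    """Per-trial summaries used in the calibration: the false MED discovery
    proportion (7) and the true positive proportion (8).

    psi_true, psi_hat : J x K 0/1 arrays (true and estimated configurations)
    med_hat           : list of estimated minimum effective doses, 1-based (j, k)
    Illustrative helper; not the study software.
    """
    R = len(med_hat)
    if R == 0:
        fdp = 0.0  # convention in the text: the proportion is 0 when R = 0
    else:
        false_med = sum(1 for (j, k) in med_hat if psi_true[j - 1, k - 1] == 0)
        fdp = false_med / R
    n_eff = int(psi_true.sum())
    tpp = np.nan if n_eff == 0 else float(
        np.sum((psi_hat == 1) & (psi_true == 1)) / n_eff)
    return fdp, tpp


# FDR and TPR are then the averages of (7) and (8) over all simulated trials, e.g.,
# fdr = np.mean([trial_metrics(*trial)[0] for trial in simulated_trials])
# tpr = np.mean([trial_metrics(*trial)[1] for trial in simulated_trials])
```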
4.2. Calibration results.
We ran the calibration process for the nonadaptive PIPE and the four adaptive algorithms (1-EQ, 1-PP, 2-EQ, and 2-PP) with $\varepsilon$ ranging from 0.01 to 0.10 in increments of 0.0025. Table 2 summarizes the calibration results of the methods, along with their operating characteristics. A few observations are noted. First, as expected, the proportions of positive findings (false findings indicated by FDR and correct findings by TPR) increase for all methods as $\varepsilon$ increases (results not shown here). Second, the correspondence between $\varepsilon$ and FDR varies slightly among the designs. For FDR at around 0.05, Algorithm 2-EQ requires the smallest $\varepsilon$ (0.0425), while Algorithm 1-PP allows the largest (0.0625). Thus, each PIPE design needs to be calibrated separately to achieve the target FDR, although the range of calibrated values is quite narrow. Third, after controlling for FDR, all four adaptive PIPE algorithms give higher TPR than the nonadaptive design. Using $\varepsilon$-tapering per Algorithm 2 further improves TPR when compared to adaptive PIPE using a fixed $\varepsilon$ per Algorithm 1.
Adaptive PIPE using a fixed $\varepsilon$ (Algorithm 1) treats a greater number of subjects at an effective dose than Algorithm 2, because the former explores away from the higher dose range at a slower rate (cf. Table 1B). Dose allocation can be improved by randomizing subjects to doses with probability proportional to $\pi_{jk}$ per Algorithms 1-PP and 2-PP, thus favoring doses that are empirically superior. Additionally, the increase in the number of subjects treated at an effective dose thus obtained has minimal impact on TPR.
5. Comparison of the calibrated designs.
To further understand the properties of the various PIPE designs, we compare the calibrated designs in specific dose-outcome scenarios with additional simulation studies.
In the first set of simulations, Figure 2 shows the operating characteristics of the calibrated designs stratified by the number of true effective doses in the ensemble simulation described in the previous section. The FDRs of all designs are quite comparable and generally decline as the number of true effective doses increases (left panel of the figure). In contrast, the TPRs of the designs depend on the number of true effective doses in different ways (middle panel of Figure 2). Adaptive PIPE with a fixed $\varepsilon$ (Algorithm 1) has diminishing TPR as the number of true effective doses increases, that is, when the true MED is near the lower end of the dose combination grid. This is due to the method’s conservative sampling approach. The TPRs of nonadaptive PIPE span a narrow range across the scenarios. For PIPE with $\varepsilon$-tapering (Algorithm 2), TPR dips in scenarios where the true MED is among the middle combinations on the grid. However, the accuracy of Algorithm 2 (both EQ and PP) is generally higher than that of the other methods across the simulation scenarios (right panel of the figure).
Fig. 2.

Operating characteristics of the calibrated designs in the ensemble simulation; each point represents an average value under a set of scenarios with the same number of true effective doses. The calibrated designs are described in Table 2.
In the second set of simulations, we compare the calibrated designs and a nonadaptive PIPE with a symmetric gain function (i.e., $\varepsilon = 1/2$) under the two scenarios depicted in Figure 1. We generated 1000 simulated trials for each method and considered a plateau dose-outcome relationship, as in the ensemble simulation scenarios.
Figure 3 shows the methods’ performance under Scenario 1, where the number of true effective doses is relatively small (eight out of 25). The adaptive algorithms are superior to the calibrated nonadaptive PIPE in terms of accuracy. In particular, the true MED (2, 4) is correctly estimated as effective with 28% probability by the nonadaptive PIPE, compared to 53%–80% by the adaptive algorithms. While nonadaptive PIPE using a symmetric gain function, that is, setting $\varepsilon = 1/2$ (upper left panel), identifies combination (2, 4) as effective with 99% probability, it also commits high false positive rates at the true ineffective doses.
Fig. 3.

Operating characteristics of various PIPE designs specified in Table 2 under Scenario 1 in Figure 1. The size of a circle is proportional to the average sample size at a dose, with a black circle indicating a true effective dose ($\psi_{jk} = 1$) and a white circle an ineffective dose ($\psi_{jk} = 0$), under the plateau dose-outcome scenario. The number in each circle is the probability the dose is estimated to be effective.
Figure 4 shows the results under Scenario 2, which has 13 truly effective doses. While the adaptive algorithms yield better accuracy than the calibrated nonadaptive design, adaptive PIPE with a fixed $\varepsilon$ has some difficulty in identifying the effective combination (1, 4), with probabilities of 17% and 22%, respectively, for Algorithms 1-EQ and 1-PP. In fact, the simulation results show that very few subjects are treated at this combination during the trial under both variations of Algorithm 1. In contrast, using $\varepsilon$-tapering, Algorithm 2 concentrates sampling around the boundary of true effective and ineffective doses in both scenarios, allowing experimentation near the target dose range. Nonadaptive PIPE with symmetric gain again produces unacceptably high false positive discoveries of true ineffective combinations.
Fig. 4.

Operating characteristics of various PIPE designs specified in Table 2 under Scenario 2 in Figure 1. The size of a circle is proportional to the average sample size at a dose, with a black circle indicating a true effective dose ($\psi_{jk} = 1$) and a white circle an ineffective dose ($\psi_{jk} = 0$), under the plateau dose-outcome scenario. The number in each circle is the probability the dose is estimated to be effective.
6. Discussion.
In this article we have proposed and examined an adaptive dose escalation framework which leverages an asymmetric gain function for decision-making and incorporates a novel design concept of $\varepsilon$-tapering. Algorithm 1 and Algorithm 2 both employ the notion of continual reassessment, often used in drug trials, and use interim MED estimates to guide dose assignments (O’Quigley, Pepe and Fisher (1990)). However, there is a critical distinction between the two algorithms: $\varepsilon$-tapering in Algorithm 2 implements gradual changes in the gain function that defines the estimation objective throughout a trial, thus moving experimentation quickly to the target dose range early while keeping false discoveries low at the end. Simulation studies have indicated that $\varepsilon$-tapering improves accuracy (true positive rates) substantially in our application, and we anticipate this strategy will be effective when the grid size or grid dimension is large so that quick early exploration is critical. Our simulation also shows that adaptive PIPE with $\varepsilon$-tapering outperforms the nonadaptive PIPE in terms of accuracy and the number of subjects treated at an effective dose, thus indicating the advantages of adaptation over simple randomization.
We adapt the original PIPE estimator proposed by Mander and Sweeting (2015), which enjoys computational ease and scalability to multidimensional treatments, and describe a decision-theoretic framework to motivate a generalized PIPE estimator. This extends Mander and Sweeting’s estimator in two ways. First, while the original PIPE focuses on binary outcomes and this article on continuous normal models, the framework lends itself to estimation procedures for different outcome types and distribution models. This broadens the utility of the framework. Second, and importantly, the decision-theoretic framework is specified by a decision parameter, $\varepsilon$, via the definition of the gain function. While $\varepsilon$ has a nice interpretation in terms of weighing between true positive and true negative decisions, it can be empirically calibrated to yield high accuracy over an ensemble of dose-outcome scenarios. The ensemble calibration steps are broadly applicable to different settings and contexts. In this article we consider plateau scenarios, each having equal weight of being chosen from the ensemble. Other dose-outcome scenarios, such as linearly increasing dose-outcome curves, can be considered for calibration purposes; unequal weights may also be given to the scenarios in the ensemble if prior information is available. The use of plateau dose-outcome curves is often advantageous, however, because they represent the least favorable configurations and yield a conservative estimate of a method’s performance (Cheung (2007), Lee and Cheung (2009)).
While this article focuses on the nonparametric generalized PIPE estimator $\hat\psi_\varepsilon$, the use of parametric estimation under the proposed decision-theoretic framework is straightforward, in principle: one could evaluate the $\pi_{jk}$'s based on a parametric dose-outcome model for $\mu_{jk}$ (e.g., $\mu_{jk}$ increases linearly in $j$ and $k$) and then plug them into the posterior gain (4) for maximization. Intuitively, this approach will improve the accuracy of MED estimation, provided that the model is correctly specified. However, postulating a correct parametric form for a multidimensional dose-outcome surface can be elusive. Also, it has been demonstrated (in one-dimensional settings) that parametric methods perform poorly under model misspecification, especially when the true underlying dose-outcome relationship is flat (Cheung (2007)). Thus, the use of parametric methods should be limited to situations where domain knowledge is available to inform the choice of an appropriate parametric function.
As our work is motivated by the sedentary break study, some practical notes are in order. First, the study aims to identify effective doses under a controlled condition: each subject will be given regimented activity during breaks (walking on a treadmill at a fixed speed) and will receive the same routine during the eight-hour period in a lab (e.g., food, meal times) so that variability across subjects and break periods, other than the experimental condition, will be kept to a minimum. Second, the study aims to identify the minimum effective doses rather than doses that maximize glucose reduction. While feasibility is not an issue in the present study because of the experimental environment, feasibility of a burdensome intervention will be a concern in pragmatic settings. A high sedentary break dose may maximize physiological benefits, but if few want to follow it, then its public health relevance is questionable.
Third, the primary study endpoint is efficacy, and toxicity is not a main concern. When applying the proposed method to phase II trials of dual pharmaceutical agents and other trial contexts, one should select the test doses based on prior phase I safety trials, for example, considering only doses below the maximum tolerated dose combinations. We also note that the proposed decision-theoretic framework can readily be adapted for phase I combination trials for the estimation of the maximum tolerated dose, for which the original PIPE was first developed. In these situations, doses may be allocated using a randomization scheme analogous to Algorithm 2-PP to minimize the risk of toxicity to the study subjects. Finally, the sedentary break study uses an endpoint that is immediately available. This is an important practical criterion for deploying a fully sequential design. For trials of drugs and radiation therapy, it is not uncommon that there are delays in endpoint evaluation or that a long-term intervention is expected.
One may employ practical modifications, such as updating the estimator only after a small group of subjects instead of after every subject. Alternatively, one may consider using a time-to-event approach (Cheung and Chappell (2000), Wheeler, Sweeting and Mander (2019)). The impact of these modifications on the method’s performance will be of great interest in these applications.
Acknowledgments.
The authors would like to thank the anonymous referees, an Associate Editor, and the Editor for their constructive comments that improved the quality of this paper.
Funding.
This work is supported by NIH grants R01HL153642, R01MH109496, and UL1TR001873.
REFERENCES
- Allison DB, Paultre F, Maggio C, Mezzitis N and Pi-Sunyer FX (1995). The use of areas under curves in diabetes research. Diabetes Care 18 245–250.
- Biswas A, Oh PI, Faulkner GE, Bajaj RR, Silver MA, Mitchell MS and Alter DA (2015). Sedentary time and its association with risk for disease incidence, mortality, and hospitalization in adults: A systematic review and meta-analysis. Ann. Intern. Med. 162 123–132.
- Braun TM and Wang S (2010). A hierarchical Bayesian design for phase 1 trials of novel combinations of cancer therapeutic agents. Biometrics 66 805–812. MR2758216 10.1111/j.1541-0420.2009.01363.x
- Cheung YK (2007). Sequential implementation of stepwise procedures for identifying the maximum tolerated dose. J. Amer. Statist. Assoc. 102 1448–1461. MR2446206 10.1198/016214507000000699
- Cheung YK and Chappell R (2000). Sequential designs for phase I clinical trials with late-onset toxicities. Biometrics 56 1177–1182. MR1815616 10.1111/j.0006-341X.2000.01177.x
- Diaz KM, Howard VJ, Hutto B, Colabianchi N, Vena JE, Blair SN and Hooker SP (2016). Patterns of sedentary behavior in US middle-age and older adults: The REGARDS study. Med. Sci. Sports Exerc. 48 430–438.
- Diaz KM, Howard VJ, Hutto B, Colabianchi N, Vena JE, Safford MM, Blair SN and Hooker SP (2017). Patterns of sedentary behavior and mortality in US middle-age and older adults: A national cohort study. Ann. Intern. Med. 167 465–475.
- Ekelund U, Tarp J, Steene-Johannessen J, Hansen BH, Jefferis B, Fagerland MW, Whincup P, Diaz KM, Hooker SP et al. (2019). Dose-response associations between accelerometry measured physical activity and sedentary time and all cause mortality: Systematic review and harmonised meta-analysis. Br. Med. J. 366 l4570.
- Hsu JC and Berger RL (1999). Stepwise confidence intervals without multiplicity adjustment for dose-response and toxicity studies. J. Amer. Statist. Assoc. 94 468–482.
- Ivanova A and Wang K (2006). Bivariate isotonic design for dose-finding with ordered groups. Stat. Med. 25 2018–2026. MR2239229 10.1002/sim.2312
- Lee SM and Cheung YK (2009). Model calibration in the continual reassessment method. Clin. Trials 6 227–238.
- Mander AP and Sweeting MJ (2015). A product of independent beta probabilities dose escalation design for dual-agent phase I trials. Stat. Med. 34 1261–1276. MR3322767 10.1002/sim.6434
- O’Quigley J, Pepe M and Fisher L (1990). Continual reassessment method: A practical design for phase 1 clinical trials in cancer. Biometrics 46 33–48. MR1059105 10.2307/2531628
- Riviere M-K, Dubois F and Zohar S (2015). Competing designs for drug combination in phase I dose-finding clinical trials. Stat. Med. 34 1–12. MR3286233 10.1002/sim.6094
- Thall PF, Millikan RE, Mueller P and Lee S-J (2003). Dose-finding with two agents in Phase I oncology trials. Biometrics 59 487–496. MR2004253 10.1111/1541-0420.00058
- Wages NA, Conaway MR and O’Quigley J (2011). Continual reassessment method for partial ordering. Biometrics 67 1555–1563. MR2872406 10.1111/j.1541-0420.2011.01560.x
- Wheeler GM, Sweeting MJ and Mander AP (2019). A Bayesian model-free approach to combination therapy phase I trials using censored time-to-toxicity data. J. R. Stat. Soc. Ser. C. Appl. Stat. 68 309–329. MR3902996 10.1111/rssc.12323
- Wheeler GM, Sweeting MJ, Mander AP, Lee SM and Cheung YKK (2017). Modelling semi-attributable toxicity in dual-agent phase I trials with non-concurrent drug administrations. Stat. Med. 36 225–241. MR3582970 10.1002/sim.6912
- Young DR, Hivert MF, Alhassan S, Camhi SM, Ferguson JF, Katzmarzyk PT, Lewis CE, Owen N, Perry CK et al. (2016). Sedentary behavior and cardiovascular morbidity and mortality: A science advisory from the American Heart Association. Circulation 134 e262–e279.
