Published in final edited form as: J R Stat Soc Ser C Appl Stat. 2018 Mar 15;68(2):461–474. doi: 10.1111/rssc.12271. Author manuscript; available in PMC 2019 May 16.

Optimizing natural killer cell doses for heterogeneous cancer patients on the basis of multiple event times

Juhee Lee 1,*, Peter F Thall 2, Katy Rezvani 3
PMCID: PMC6521706  NIHMSID: NIHMS938910  PMID: 31105345

Abstract

A sequentially adaptive Bayesian design is presented for a clinical trial of cord blood derived natural killer cells to treat severe hematologic malignancies. Given six prognostic subgroups defined by disease type and severity, the goal is to optimize cell dose in each subgroup. The trial has five co-primary outcomes, the times to severe toxicity, cytokine release syndrome, disease progression or response, and death. The design assumes a multivariate Weibull regression model, with marginals depending on dose, subgroup, and patient frailties that induce association among the event times. Utilities of all possible combinations of the nonfatal outcomes over the first 100 days following cell infusion are elicited, with posterior mean utility used as a criterion to optimize dose. For each subgroup, the design stops accrual to doses having an unacceptably high death rate, and at the end of the trial selects the optimal safe dose. A simulation study is presented to validate the design’s safety, ability to identify optimal doses, and robustness, and to compare it to a simplified design that ignores patient heterogeneity.

Keywords: Cellular Therapy, Dose Finding, Natural Killer Cells, Precision Medicine, Phase I–II Clinical Trial

1 Introduction

This paper describes a sequentially adaptive Bayesian design for an early phase clinical trial of umbilical cord blood derived natural killer (NK) cells as therapy for advanced hematologic diseases. NK cells are lymphocytes that can be used for cancer immunotherapy because they play a critical role in natural immune surveillance and are the body’s first line of defense against viruses and newly transformed cancer cells (Rezvani and Rouce, 2015). Patients may have chronic lymphocytic leukemia (CLL), acute lymphocytic leukemia (ALL), or non-Hodgkin’s lymphoma (NHL), with either low or high bulk disease (LBD or HBD), so disease type and bulk determine six prognostic subgroups. The trial’s primary goal is to identify an optimal NK cell dose for each subgroup. Rather than monitoring one toxicity as in phase I trials (Cheung, 2011) or bivariate (efficacy, toxicity) as in phase I–II (Yuan et al., 2016), there are five co-primary outcomes. These are the times from cell infusion to death (D), disease progression (P), response (R), severe toxicity (T), and severe cytokine release syndrome (C). Since T and C are most likely to occur soon after cell infusion, they are monitored for LT = LC = 100 days, while P,R, and D are monitored for LP = LR = LD = 365 days. The time Yj to event j is independently censored at Lj for each j ∈ {P,R, T,C,D}, death informatively censors all other events, and YP and YR are competing risks. Based on clinical experience, the Yj’s are interdependent, the distribution of Y = (YR, YC, YP, YT, YD) varies between subgroups, and the adverse event times (YC, YP, YT, YD) are positively associated.

Denote NK cell dose by d, subgroup by g, and the probability of death prior to day 100 by πD(d, g) = Pr(YD ≤ 100 | d, g) for all (g, d) pairs. In the design, if πD(d, g) is unacceptably high compared to an elicited fixed upper limit π̄D(g) for subgroup g, then d is discontinued in g. A conventional safety rule ignoring subgroups would discontinue d for all patients if an interim estimate of πD(d) = Pr(YD ≤ 100 | d) is unacceptably high. This rule produces a design with high probabilities of making incorrect decisions within subgroups. For example, given historical values π̄D(1) = 0.10 and π̄D(2) = 0.40 with standard therapy for subgroups g = 1 (low risk) and g = 2 (high risk), if the interim estimate π̂D(d) = 0.25, obtained by ignoring subgroups, triggers the decision to stop all accrual to dose d, this is likely to be correct for subgroup g = 1 but incorrect for g = 2.

A major logistical problem in trial conduct is that each patient’s outcome is based on five event times monitored for up to 365 days. Each outcome is fully evaluated at its occurrence time, or censored by death or administratively at the end of its 100 or 365 day follow up period. When choosing a dose adaptively for a new patient, it is likely that some outcomes of previous patients have not yet been evaluated fully, and it is not feasible to suspend accrual to wait for full evaluation of all previous patients’ data. This problem was solved for phase I trials by the time-to-event continual reassessment method (TiTE-CRM) (Cheung and Chappell, 2000). Phase I–II designs based on bivariate (efficacy, toxicity) event times have been proposed by Zhou et al. (2006), Thall et al. (2013), and Jin et al. (2014), among others.

In the NK cell trial, it is assumed that Pr(CLL) = Pr(ALL) = Pr(NHL) = 1/3, and Pr(LBD) = 1/3 and Pr(HBD)= 2/3. Thus, given maximum sample size Nmax = 60, approximate expected subgroup sample sizes are (7, 7, 7, 13, 13, 13), and these subsamples are divided further among the NK cell doses. This limits the reliability of subgroup-specific safety monitoring and optimal dose selection. Based on current knowledge about NK cell biology, the six hazard functions may not be monotone in NK cell dose. Finally, given the five event time outcomes, what is meant by “optimal NK cell dose” in each subgroup is not obvious. We present a design that addresses all of these problems.

To provide a basis for dose-finding, we assume a Bayesian multivariate parametric regression model for Y, with Weibull marginals having scale parameters that vary with (d, g). A vector of patient frailties is introduced, with one frailty acting multiplicatively on the marginal hazard of each event time. The frailty vector accounts for additional variability not accounted for by subgroup, and also induces dependence between the elements of Y. Frailty models have been used widely for multivariate failure time data, including the multivariate log-normal distribution (Ripatti and Palmgren, 2000), competing risks (Gorfine and Hsu, 2011), and semi-competing risks (Lee et al., 2016).

Our design is nominally phase I–II since it includes the desirable event R and the adverse events {P, T, C, D}, and it is an example of precision medicine since decisions are subgroup-specific. Reviews of phase I–II designs are given by Zohar and Chevret (2007) and Yuan et al. (2016). Our use of posterior mean utility is similar to, for example, the phase I–II designs of Houede et al. (2010) and Thall et al. (2013), and the randomized trial design of Murray et al. (2016). Phase I–II designs accounting for patient heterogeneity have been given by Thall et al. (2008), Chen et al. (2016), and Guo and Yuan (2017).

Section 2 presents the probability model underlying the design. Section 3 describes utility computation, and Section 4 presents the design. In Section 5, a simulation study is presented to evaluate the design’s safety, ability to identify optimal doses, and robustness, and to compare it to a simplified design ignoring patient heterogeneity. We close with a brief discussion in Section 6.

2 Probability Model

2.1 Recording Event Times

For interim sample size n(t) ≤ Nmax at trial time t, index patients in order of enrollment by i = 1, ···, n(t), with trial entry times 0 ≤ e_1 ≤ e_2 ≤ ··· ≤ e_{n(t)}. The trial time of event j for patient i is e_i + Y_{i,j}, if it is observed. For j = P, R or D, Y_{i,j} is followed until e_i + 365, and Y_{i,T} and Y_{i,C} are followed until e_i + 100. Let Y^o_{i,j} denote the time of observation of Y_{i,j} or of right censoring, with δ_{i,j} = 1 if Y^o_{i,j} = Y_{i,j} and 0 otherwise. At trial time t > e_i, if Y_{i,D} > t − e_i or Y_{i,D} > 365, then Y^o_{i,D} is the time of independent right censoring (δ_{i,D} = 0). If Y_{i,D} < min(t − e_i, 365), then Y^o_{i,D} = Y_{i,D} is the observed time of death (δ_{i,D} = 1). For the nonfatal events j = T, C, if Y_{i,j} < min{t − e_i, L_j, Y_{i,D}}, then Y^o_{i,j} = Y_{i,j} (δ_{i,j} = 1), and otherwise Y^o_{i,j} is the time of right censoring (δ_{i,j} = 0). If Y_{i,P} < min{t − e_i, L_P, Y_{i,R}, Y_{i,D}}, i.e. Y_{i,P} is observed, then Y^o_{i,P} = Y^o_{i,R} = Y_{i,P} and (δ_{i,P}, δ_{i,R}) = (1, 0). If Y_{i,R} < min{t − e_i, L_R, Y_{i,P}, Y_{i,D}}, then Y^o_{i,P} = Y^o_{i,R} = Y_{i,R} and (δ_{i,P}, δ_{i,R}) = (0, 1). If neither Y_{i,P} nor Y_{i,R} is observed, then (δ_{i,P}, δ_{i,R}) = (0, 0). Since P and R are competing risks, (δ_{i,P}, δ_{i,R}) = (1, 1) is not possible. If censoring occurs because Y_{i,j} > min{t − e_i, L_j} and the patient is alive at trial time t, then the censoring is independent. For j ≠ D, if censoring of Y_{i,j} is due to death, i.e. Y_{i,D} < Y_{i,j}, then the censoring is not independent and Y^o_{i,j} = Y^o_{i,D} (δ_{i,j} = 0).
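To make this bookkeeping concrete, the following minimal Python sketch (ours, not the authors' software; the function observed_data and all variable names are illustrative) computes (Y^o_{i,j}, δ_{i,j}) for one patient from latent event times, the entry time e_i, the current trial time t, and the follow-up limits L_j.

```python
import numpy as np

# Follow-up limits L_j in days, indexed by event label (P, R, T, C, D).
L = {"P": 365, "R": 365, "T": 100, "C": 100, "D": 365}

def observed_data(latent, e_i, t):
    """Return (Y^o_{i,j}, delta_{i,j}) for one patient at trial time t.

    `latent` maps each event label j to the latent time Y_{i,j} (np.inf if the
    event would never occur).  Death censors all nonfatal events, and P and R
    censor each other (competing risks)."""
    follow = t - e_i                          # time on study so far
    y_obs, delta = {}, {}

    # Death: observed only if it occurs before trial time t and before L_D.
    yD = latent["D"]
    if yD < min(follow, L["D"]):
        y_obs["D"], delta["D"] = yD, 1
    else:
        y_obs["D"], delta["D"] = min(follow, L["D"]), 0

    # Nonfatal events T and C: censored by t, L_j, or death.
    for j in ("T", "C"):
        cens = min(follow, L[j], yD)
        if latent[j] < cens:
            y_obs[j], delta[j] = latent[j], 1
        else:
            y_obs[j], delta[j] = cens, 0

    # P and R are competing risks: at most one can be observed.
    cens = min(follow, L["P"], yD)
    if latent["P"] < min(cens, latent["R"]):
        y_obs["P"] = y_obs["R"] = latent["P"]; delta["P"], delta["R"] = 1, 0
    elif latent["R"] < min(cens, latent["P"]):
        y_obs["P"] = y_obs["R"] = latent["R"]; delta["P"], delta["R"] = 0, 1
    else:
        y_obs["P"] = y_obs["R"] = cens; delta["P"], delta["R"] = 0, 0
    return y_obs, delta

# Example: patient enrolled at day 10, evaluated at trial day 200.
print(observed_data({"P": 150, "R": np.inf, "T": 20, "C": np.inf, "D": np.inf}, 10, 200))
```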

2.2 Sampling and Frailty Models

Index NK cell doses by d = 1, 2, 3, and define Z = 0 for LBD and Z = 1 for HBD. Index the disease types CLL, ALL, NHL by r = 1, 2, 3. Denote Z = (Z, r), which replaces the subgroup index g. For i = 1, ···, n(t), denote dose by d_i and covariates by Z_i = (Z_i, r_i). Generalizing the description of two semi-competing risks given by Fine et al. (2001), here the joint distribution of Y is defined on the set 𝒴_D = {y ∈ [0, ∞)^5 : max(y_P, y_R, y_C, y_T) < y_D}, since death censors any nonfatal event but not conversely. On 𝒴_D, Y_R and Y_P have the usual competing risks structure, with at most one observed. Thus, h_R and h_P are subhazard functions (cause-specific hazards), where h_j(y), j = R, P, is interpreted as the hazard at time y of the first of R or P occurring and being event j, and h_R(y) + h_P(y) is the hazard of either P or R on 𝒴_D. In the sequel we will abuse conventional terminology by referring to either hazards or subhazards as “hazard functions,” and do the same for survivor functions S_j(y).

We assume Weibull marginal event time distributions,

\[
Y_{i,j} \mid \alpha_j, \lambda_{i,j} \;\overset{\text{indep}}{\sim}\; \mbox{Weibull}\big(\alpha_j,\, \exp(\lambda_{i,j})\big), \qquad i = 1, \ldots, n(t), \;\; j \in \{P, R, T, C, D\}, \tag{1}
\]

with shape parameter α_j > 0 and scale exp(λ_{i,j}), so that the hazard and survival functions are h_j(y | α_j, λ_{i,j}) = α_j exp(λ_{i,j}) y^{α_j − 1} and S_j(y | α_j, λ_{i,j}) = exp{−exp(λ_{i,j}) y^{α_j}} for y > 0. The joint conditional likelihood of all observable outcomes on 𝒴_D is the product of the individual contributions p(y_i^o, δ_i | α, λ_i) = ∏_{j ∈ {P,…,D}} {h_j(y_{i,j}^o | α_j, λ_{i,j})}^{δ_{i,j}} S_j(y_{i,j}^o | α_j, λ_{i,j}) over the observed (y_i^o, δ_i) (Prentice et al., 1978; Kalbfleisch and Prentice, 2011).
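As a small illustration of (1) and of this likelihood contribution, the sketch below (illustrative Python with toy parameter values that are not taken from the paper) evaluates the Weibull hazard and survival functions and one patient's log-likelihood contribution ∑_j {δ_{i,j} log h_j(y^o_{i,j}) + log S_j(y^o_{i,j})}.

```python
import numpy as np

def weibull_hazard(y, alpha, lam):
    """h_j(y | alpha_j, lambda_{i,j}) = alpha * exp(lam) * y**(alpha - 1)."""
    return alpha * np.exp(lam) * y ** (alpha - 1.0)

def weibull_logsurv(y, alpha, lam):
    """log S_j(y | alpha_j, lambda_{i,j}) = -exp(lam) * y**alpha."""
    return -np.exp(lam) * y ** alpha

def patient_loglik(y_obs, delta, alpha, lam):
    """One patient's log-likelihood contribution,
    sum_j [ delta_j * log h_j(y_j^o) + log S_j(y_j^o) ]."""
    ll = 0.0
    for j in y_obs:
        ll += delta[j] * np.log(weibull_hazard(y_obs[j], alpha[j], lam[j]))
        ll += weibull_logsurv(y_obs[j], alpha[j], lam[j])
    return ll

# Toy example: common shape 1.2 and illustrative scale parameters lambda_{i,j}.
alpha = {j: 1.2 for j in "PRTCD"}
lam = {"P": -6.0, "R": -5.5, "T": -5.0, "C": -6.5, "D": -7.0}
y_obs = {"P": 200.0, "R": 200.0, "T": 20.0, "C": 100.0, "D": 365.0}
delta = {"P": 1, "R": 0, "T": 1, "C": 0, "D": 0}
print(patient_loglik(y_obs, delta, alpha, lam))
```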

Given the form (1) of the marginals, we formulate a joint model to account for effects of dose and subgroup on each Yij, by including dose-subgroup parameters in each λi,j. Let ui,j be a latent frailty associated with patient i for outcome j, with ui = (ui,P, ···, ui,D). The relationship between Yi,j, di, and Zi = (Zi, ri) is based on the regression model

\[
\lambda_{i,j}(Z_i, d_i, u_{i,j}) \;=\; \beta_j Z_i + \xi_j \psi_{r_i, d_i} + u_{i,j}, \tag{2}
\]

with ξ_D ≡ 1 to ensure identifiability. Thus, ψ_{r,d} is the effect of d on the death rate for disease type r. Since ψ_{r,d} describes the relationship between r, d, and Y_j for all j, it combines information across outcomes. A larger ψ_{r,d} implies a higher risk of death for patients with disease type r and dose d, and if ξ_j > 0 (ξ_j < 0) it also implies a higher (lower) risk of outcome j. The parameter β_j is the additive effect of HBD (Z = 1) on log(h_j), with β_j > 0 reflecting a higher event rate, equivalently smaller E(Y_j), for HBD versus LBD. The regression model (2) reduces the number of parameters from 6 × 3 × 5 = 90, if a saturated effect ψ_{Z,r,d,j} were used, to 5 + 9 + 4 = 18. The model is parsimonious enough that adaptive subgroup-specific decisions can be made tractably with the trial’s limited sample size, yet it is still flexible enough to accommodate a wide range of relationships between d, Z and Y.

For the patient frailties, we assume u_i ~iid N_5(0, Ω) with Ω ~ inverse-Wishart(ν, Ω_0) for fixed degrees of freedom ν > J − 1 = 4, where J = 5 is the number of outcomes, and 5 × 5 positive definite hyperparameter matrix Ω_0. We incorporate {u_i, i = 1, ···, n(t)} into the five hazard functions to account for possible heterogeneity between patients beyond that due to the prognostic subgroups. The correlations among the u_{i,j}’s also induce dependence among the outcomes of each patient. Combining (1) and (2), conditional on u_{i,j}, the hazard function of Y_{i,j} is

\[
h_j(y \mid \alpha_j, \xi_j, \psi_{r_i, d_i}, Z_i, u_{i,j}) \;=\; \alpha_j \exp\big(\beta_j Z_i + \xi_j \psi_{r_i, d_i} + u_{i,j}\big)\, y^{\alpha_j - 1}, \qquad y > 0,
\]

and we assume that the elements of Y_i are conditionally independent given u_i and θ on the set 𝒴_D. Suppressing the patient index i, the joint survival function for y ∈ 𝒴_D is obtained by averaging over the frailty distribution,

\[
S(y, y_T, y_C, y_D \mid \boldsymbol{\alpha}, \boldsymbol{\xi}, \psi_{r,d}, Z, \nu, \Omega_0) \;=\; \int \prod_{j \in \{P,R\}} S_j(y \mid \alpha_j, \xi_j, \psi_{r,d}, u_j) \times \prod_{j \in \{T,C,D\}} S_j(y_j \mid \alpha_j, \xi_j, \psi_{r,d}, u_j)\; p(\boldsymbol{u}, \Omega \mid \nu, \Omega_0)\, d\boldsymbol{u}\, d\Omega.
\]

In this model, the hazard functions of all outcomes, including the competing risks P and R, are estimable (Prentice et al., 1978; Kalbfleisch and Prentice, 2011).
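The frailty-averaged joint survival function above can be approximated by Monte Carlo: draw frailty vectors u ~ N_5(0, Ω), evaluate the product of conditional survival functions, and average. The sketch below does this with illustrative parameter values (none are the paper's elicited or fitted values).

```python
import numpy as np

rng = np.random.default_rng(0)
events = ["P", "R", "T", "C", "D"]

# Illustrative (not elicited) parameter values.
alpha = {j: 1.2 for j in events}
beta  = {j: 0.3 for j in events}                              # HBD effect on log-hazard
xi    = {"P": 0.8, "R": -0.5, "T": 0.6, "C": 0.6, "D": 1.0}   # xi_D = 1
psi_rd = -6.0                                                 # psi_{r,d} for one (r, d)
Omega = 0.2 * np.eye(5)                                       # frailty covariance

def log_surv(y, j, Z, u_j):
    """Conditional log S_j(y) under (1)-(2) given the frailty u_j."""
    lam = beta[j] * Z + xi[j] * psi_rd + u_j
    return -np.exp(lam) * y ** alpha[j]

def joint_surv(y, yT, yC, yD, Z, n_mc=20000):
    """S(y, y_T, y_C, y_D | ...) averaged over u ~ N_5(0, Omega)."""
    U = rng.multivariate_normal(np.zeros(5), Omega, size=n_mc)
    times = {"P": y, "R": y, "T": yT, "C": yC, "D": yD}
    total = np.zeros(n_mc)
    for k, j in enumerate(events):
        total += log_surv(times[j], j, Z, U[:, k])
    return np.exp(total).mean()

# Probability of no event by day 100 for an HBD patient under these toy values:
print(joint_surv(100.0, 100.0, 100.0, 100.0, Z=1))
```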

2.3 Prior Distributions and Posterior Computation

We specify priors for the model parameters α = (α_j, j = P, …, D), β = (β_j, j = P, …, D), ξ = (ξ_j, j = P, R, T, C), and ψ = (ψ_{r,d}, r = 1, …, K, d = 1, …, m) as follows. We assume α_j ~indep Ga(a_j, b_j), where Ga(a_j, b_j) denotes the gamma distribution with mean a_j/b_j and variance a_j/b_j^2, and let ξ_j ~indep N(ξ̄_j, ω_j^2) for each j, with ξ_D ≡ 1. We assume ψ_{r,d} ~indep N(ψ̄_r, τ_r^2) to allow the diseases CLL, ALL, NHL (r = 1, 2, 3) to have different outcome rates. To reflect higher hazards of adverse outcomes for patients with HBD, we assume that their effects follow normal distributions truncated below at 0, so that p(β_j) ∝ exp{−(β_j − β̄_j)^2/(2σ_j^2)} for β_j > 0, j = T, C, D, where β̄_j and σ_j^2 denote fixed hyperparameters for all j. The priors express no information on the directions of the HBD effects on the hazards of the sub-distributions of Y_{i,P} and Y_{i,R}, so β_j ~ N(β̄_j, σ_j^2) for j = P, R. We denote θ = (α, β, ξ, ψ, Ω) and the hyperparameter vector θ* = (a, b, ξ̄, ω^2, ψ̄, τ^2, β̄, σ^2, ν, Ω_0), where (a, b) = {(a_j, b_j), j = P, …, D}, (ξ̄, ω^2) = {(ξ̄_j, ω_j^2), j = P, …, C}, (ψ̄, τ^2) = {(ψ̄_r, τ_r^2), r = 1, …, K}, and (β̄, σ^2) = {(β̄_j, σ_j^2), j = P, …, D}.
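A sketch of this prior specification is given below; the hyperparameter values are placeholders rather than the calibrated values of Supplementary §3, and the sampling calls simply illustrate the distributional forms (gamma, normal, truncated normal, and inverse-Wishart).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
events = ["P", "R", "T", "C", "D"]

# Placeholder hyperparameters theta* (the calibrated values are in Supp. §3).
a  = {j: 2.0 for j in events}; b = {j: 2.0 for j in events}
xi_bar = {"P": 1.0, "R": -1.0, "T": 0.5, "C": 0.5}; omega2 = 1.0
psi_bar = {1: -6.0, 2: -5.5, 3: -5.8}; tau2 = 1.0
beta_bar = {j: 0.5 for j in events}; sigma2 = 1.0
nu, Omega0 = 7.0, np.eye(5)

def draw_prior():
    alpha = {j: rng.gamma(a[j], 1.0 / b[j]) for j in events}        # mean a_j / b_j
    xi = {j: rng.normal(xi_bar[j], np.sqrt(omega2)) for j in xi_bar}
    xi["D"] = 1.0                                                    # xi_D fixed at 1
    psi = {(r, d): rng.normal(psi_bar[r], np.sqrt(tau2))
           for r in (1, 2, 3) for d in (1, 2, 3)}
    beta = {}
    for j in events:
        if j in ("T", "C", "D"):          # truncated below at 0 for adverse outcomes
            beta[j] = stats.truncnorm.rvs(
                -beta_bar[j] / np.sqrt(sigma2), np.inf,
                loc=beta_bar[j], scale=np.sqrt(sigma2), random_state=rng)
        else:                             # unrestricted for the P and R sub-distributions
            beta[j] = rng.normal(beta_bar[j], np.sqrt(sigma2))
    Omega = stats.invwishart.rvs(df=nu, scale=Omega0, random_state=rng)
    return alpha, beta, xi, psi, Omega

alpha, beta, xi, psi, Omega = draw_prior()
print(alpha["D"], beta["D"], Omega.shape)
```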

To establish θ*, we elicited probabilities of the 12 joint events occurring within follow-up of 30 days for C and 100 days for each j ≠ C (Table 1), chosen by the clinical investigators. Denote L′ = (L′_P, L′_R, L′_T, L′_C, L′_D) = (100, 100, 100, 30, 100). We then solved sets of equations under the assumed model to obtain prior means, and calibrated dispersion parameters to reflect vague prior knowledge. Details of prior calibration are given in Supplementary §3.

Table 1.

Elicited probabilities π^(e)_{j,Z,r} provided by the clinicians. The probabilities are used to establish values for the fixed prior hyperparameters ψ̄, β̄, ξ̄ and a.

Disease Bulk                Disease Type   Progression   Response   Severe Toxicity   Death
Low Bulk Disease (Z = 0)    CLL (r = 1)       0.05          0.20          0.25         0.02
                            ALL (r = 2)       0.15          0.50          0.25         0.10
                            NHL (r = 3)       0.10          0.35          0.25         0.05

High Bulk Disease (Z = 1)   CLL (r = 1)       0.20          0.35          0.25         0.10
                            ALL (r = 2)       0.40          0.60          0.40         0.20
                            NHL (r = 3)       0.40          0.40          0.25         0.15

Given θ* and interim data 𝒟n(t) at trial time t, the joint posterior of all parameters θ and patient specific random effects u = {ui, i = 1, …, n(t)} is

\[
p(\theta, \boldsymbol{u} \mid \mathcal{D}_{n(t)}, \theta^*) \;\propto\; \prod_{i=1}^{n(t)} p(\boldsymbol{y}_i^o, \boldsymbol{\delta}_i \mid \theta, \boldsymbol{u}_i)\; p(\theta, \boldsymbol{u} \mid \theta^*) \;=\; \prod_{i=1}^{n(t)} \prod_{j \in \{P, \ldots, D\}} \big\{h_j(y_{i,j}^o \mid \alpha_j, \lambda_{i,j})\big\}^{\delta_{i,j}}\, S_j(y_{i,j}^o \mid \alpha_j, \lambda_{i,j})\; p(\theta, \boldsymbol{u} \mid \theta^*). \tag{3}
\]

We use Markov chain Monte Carlo (MCMC) simulation to generate posterior samples of θ and u. Details of posterior computation are given in Supplementary §1. A computer program “NKcelldosefinding” for implementing this methodology is available from https://users.soe.ucsc.edu/~juheelee/.

3 Computing Utilities

Denote δ′ = (δ′_P, δ′_R, δ′_T, δ′_C, δ′_D), where δ′_j = 1 if Y_j is observed by L′_j, and 0 otherwise. Since (δ′_P, δ′_R) = (1, 1) is impossible, there are 3 × 2^3 = 24 possible outcomes δ′ ∈ Δ. Denote by Δ_0 the subset of Δ with δ′_D = 0, for patients who survive 100 days. As a practical approach, we elicited utilities on the set Δ_0, rather than for all possible Y, by first fixing the minimum utility U(δ′) = 0 if δ′_D = 1, i.e. if a patient dies before day 100. There are |Δ_0| = 12 possible early event combinations in Δ_0, so computation of the posterior mean utility for each (d, Z) only requires evaluation of π(δ′ | d, Z, θ) for each of the 12 indicator vectors δ′ ∈ Δ_0. The elicited utilities U(δ′) for all δ′ ∈ Δ_0 are given in Table 2.

Table 2.

Elicited utilities U(δ′) of all possible combinations of discrete outcomes for patients who survive 100 days, i.e. for δ′ with δ′_D = 0. For any δ′ with δ′_D = 1, U(δ′) = 0. The set Δ of all possible values does not include any δ′ with (δ′_P, δ′_R) = (1, 1), since P and R are competing risks.

                          (δ′_P, δ′_R)
δ′_C    δ′_T       (1,0)     (0,0)     (0,1)
  0       0          20        50        90
  0       1          10        30        70
  1       0          10        30        70
  1       1           5        20        50

To compute mean utilities, we use the fact that the distribution of [δ′ | d,Z] is induced by that of [Y | d,Z], for each (d,Z). For example,

\[
\pi\big((1,0,1,0,0) \mid d, Z, \theta\big) \;=\; \Pr\big(Y_P \le L'_P,\, Y_R > Y_P,\, Y_T \le L'_T,\, Y_C > L'_C,\, Y_D > L'_D \mid d, Z, \theta\big) \;=\; \int_0^{L'_P}\!\! \int_{y_P}^{\infty}\!\! \int_0^{L'_T}\!\! \int_{L'_C}^{\infty}\!\! \int_{L'_D}^{\infty}\!\! \int_{\mathbb{R}^5} p(\boldsymbol{y} \mid d, Z, \boldsymbol{u}, \theta)\, p(\boldsymbol{u} \mid \theta)\, d\boldsymbol{u}\, d\boldsymbol{y}.
\]

Given θ, the mean utility of assigning dose d to a patient with covariates Z is

\[
\bar{U}(d, Z, \theta) \;=\; \sum_{\delta' \in \Delta_0} U(\delta')\, \pi(\delta' \mid d, Z, \theta). \tag{4}
\]

To estimate Ū(d,Z, θ), a frequentist approach might plug in an estimator θ̂ and use Ū(d,Z, θ̂). We exploit the Bayesian structure to compute posterior predictive mean utilities for use as dose selection criteria. Given the final data 𝒟Nmax, when all Nmax patients have been followed up fully, the posterior predictive mean utility of giving dose d to a future patient with covariates Z is

\[
u(d, Z \mid \mathcal{D}_{N_{\max}}) \;=\; \int_{\theta} \bar{U}(d, Z, \theta)\, p(\theta \mid \mathcal{D}_{N_{\max}})\, d\theta. \tag{5}
\]

This is used as an optimality criterion for dose selection at the end of the trial. While utilities are elicited over the early follow up intervals L′, all of the follow up information on (Yi, δi), for i = 1, ···,Nmax over L is used to compute u(d,Z | 𝒟Nmax), using a posterior MCMC sample of θ ~ p(θ | 𝒟Nmax, θ*). Details are given in Supplementary §1.
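To illustrate how (4) and (5) can be evaluated, the sketch below enumerates Δ_0, attaches the Table 2 utilities, estimates π(δ′ | d, Z, θ) for a single θ by forward simulation from the assumed model, and then averages Ū(d, Z, θ) over posterior draws. The single "posterior draw" shown is a placeholder, not a fitted value; in practice the draws would come from the MCMC sampler of Supplementary §1.

```python
import numpy as np

rng = np.random.default_rng(2)
events = ["P", "R", "T", "C", "D"]
Lp = {"P": 100, "R": 100, "T": 100, "C": 30, "D": 100}   # early horizons L'

# Table 2 utilities U(delta') for the 12 nonfatal combinations; keys are
# (delta'_P, delta'_R, delta'_T, delta'_C), and U = 0 whenever delta'_D = 1.
U = {(1, 0, 0, 0): 20, (0, 0, 0, 0): 50, (0, 1, 0, 0): 90,
     (1, 0, 1, 0): 10, (0, 0, 1, 0): 30, (0, 1, 1, 0): 70,
     (1, 0, 0, 1): 10, (0, 0, 0, 1): 30, (0, 1, 0, 1): 70,
     (1, 0, 1, 1):  5, (0, 0, 1, 1): 20, (0, 1, 1, 1): 50}

def outcome_probs(theta, Z, n_mc=5000):
    """Monte Carlo estimate of pi(delta' | d, Z, theta) for one theta draw."""
    alpha, beta, xi, psi_rd, Omega = theta
    counts = {}
    for _ in range(n_mc):
        u = rng.multivariate_normal(np.zeros(5), Omega)
        lam = {j: beta[j] * Z + xi[j] * psi_rd + u[k] for k, j in enumerate(events)}
        # Weibull(alpha, exp(lam)) draw via a scaled standard Weibull variate.
        y = {j: rng.weibull(alpha[j]) * np.exp(-lam[j] / alpha[j]) for j in events}
        if y["D"] <= Lp["D"]:
            continue                      # death before day 100: utility 0
        dP = int(y["P"] <= min(Lp["P"], y["R"], y["D"]))
        dR = int(y["R"] <= min(Lp["R"], y["P"], y["D"]))
        dT = int(y["T"] <= min(Lp["T"], y["D"]))
        dC = int(y["C"] <= min(Lp["C"], y["D"]))
        key = (dP, dR, dT, dC)
        counts[key] = counts.get(key, 0) + 1
    return {k: counts.get(k, 0) / n_mc for k in U}

def mean_utility(theta, Z):
    """Equation (4): sum over Delta_0 of U(delta') * pi(delta' | d, Z, theta)."""
    pi = outcome_probs(theta, Z)
    return sum(U[k] * pi[k] for k in U)

def posterior_mean_utility(posterior_draws, Z):
    """Equation (5), approximated by averaging over posterior draws of theta."""
    return float(np.mean([mean_utility(theta, Z) for theta in posterior_draws]))

# One illustrative 'posterior draw' (placeholder values, not fitted results).
alpha = {j: 1.2 for j in events}
beta  = {j: 0.3 for j in events}
xi    = {"P": 0.8, "R": -0.5, "T": 0.6, "C": 0.6, "D": 1.0}
theta = (alpha, beta, xi, -6.0, 0.2 * np.eye(5))
print(posterior_mean_utility([theta], Z=1))
```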

4 Trial Design

To ensure ethical conduct of the NK cell trial, subgroup-specific safety monitoring rules are used. For each (d, Z), denote π_D(d, Z, θ) = Pr(Y_D ≤ 100 | d, Z, θ) = Pr(δ′_D = 1 | d, Z, θ). Let π̄_D(Z) be an elicited fixed upper limit on π_D(d, Z, θ) for subgroup Z, and let p_{D,1} be a fixed cut-off probability. At trial time t, if

\[
\Pr\{\pi_D(d, Z, \theta) > \bar{\pi}_D(Z) \mid \mathcal{D}_{n(t)}\} \;>\; p_{D,1}, \tag{6}
\]

then d is considered unsafe for subgroup Z, and is no longer administered to patients in that subgroup. Elicited values of π̄D(Z) are given in Table 3. To obtain a design with high subgroup-specific probabilities of stopping a truly unsafe dose and selecting the best safe dose for each Z, we investigated cut-offs 0.80 and 0.90, and chose pD,1 = 0.80.
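Rule (6) is a simple functional of the posterior of π_D(d, Z, θ) and can be checked directly from posterior draws; a minimal sketch with simulated placeholder draws follows.

```python
import numpy as np

def dose_unsafe(piD_draws, piD_limit, p_cut=0.80):
    """Rule (6): declare dose d unsafe for subgroup Z if
    Pr{ pi_D(d, Z, theta) > piD_bar(Z) | data } > p_cut,
    estimated from posterior draws of pi_D(d, Z, theta)."""
    return np.mean(np.asarray(piD_draws) > piD_limit) > p_cut

# Placeholder posterior draws of pi_D(d, Z, theta) for one (d, Z) pair.
rng = np.random.default_rng(3)
draws = rng.beta(10, 10, size=4000)          # posterior mass centred near 0.50
print(dose_unsafe(draws, piD_limit=0.30))    # True: dose dropped for this subgroup
print(dose_unsafe(draws, piD_limit=0.50))    # False: dose remains acceptable
```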

Table 3.

Simulation truth and results. π_D^TR = true probability of death within 100 days for each combination of disease type (r), dose level (d), and disease bulk (Z), with π̄_D(Z) = the fixed safety threshold; a dose with π_D^TR > π̄_D(Z) is truly unsafe for that subgroup. Ū^TR = true expected utility for each (r, d, Z); the truly optimal dose for each subgroup is the safe dose maximizing Ū^TR.

Bulk (→)                                LBD (Z = 0)                          HBD (Z = 1)
Types (↓)          Dose (→)     d = 1    d = 2    d = 3    π̄_D      d = 1    d = 2    d = 3    π̄_D
Scenario 1
  CLL (r = 1)    π_D^TR         0.02     0.02     0.02     0.15      0.10     0.10     0.10     0.30
                 Ū^TR          46.32    46.32    46.32              44.04    44.04    44.04
                 Pr(Stop)       0.00     0.00     0.00               0.00     0.00     0.00
                 Pr(Select)     0.33     0.37     0.30               0.33     0.34     0.33

  ALL (r = 2)    π_D^TR         0.10     0.10     0.10     0.20      0.25     0.25     0.25     0.40
                 Ū^TR          50.52    50.52    50.52              37.97    37.97    37.97
                 Pr(Stop)       0.00     0.01     0.00               0.03     0.03     0.02
                 Pr(Select)     0.33     0.35     0.32               0.32     0.36     0.33

  NHL (r = 3)    π_D^TR         0.05     0.05     0.05     0.20      0.15     0.15     0.15     0.40
                 Ū^TR          49.18    49.18    49.18              38.35    38.35    38.35
                 Pr(Stop)       0.00     0.00     0.00               0.00     0.00     0.00
                 Pr(Select)     0.33     0.34     0.33               0.35     0.31     0.34

Scenario 2
  CLL (r = 1)    π_D^TR         0.02     0.45     0.60     0.15      0.04     0.70     0.84     0.30
                 Ū^TR          42.34    22.86    16.00              39.73     9.49     4.63
                 Pr(Stop)       0.00     0.89     0.97               0.00     0.98     1.00
                 Pr(Select)     1.00     0.00     0.00               1.00     0.00     0.00

  ALL (r = 2)    π_D^TR         0.40     0.60     0.05     0.20      0.64     0.84     0.10     0.40
                 Ū^TR          23.81    15.82    40.41              10.99     4.62    36.02
                 Pr(Stop)       0.67     0.92     0.00               0.81     0.98     0.00
                 Pr(Select)     0.00     0.00     1.00               0.00     0.00     1.00

  NHL (r = 3)    π_D^TR         0.65     0.05     0.35     0.20      0.88     0.10     0.58     0.40
                 Ū^TR          14.03    40.47    26.31               3.46    36.08    13.26
                 Pr(Stop)       0.95     0.00     0.55               1.00     0.00     0.68
                 Pr(Select)     0.00     1.00     0.00               0.00     1.00     0.00

Scenario 3
  CLL (r = 1)    π_D^TR         0.42     0.38     0.37     0.15      0.66     0.62     0.60     0.30
                 Ū^TR          40.33    44.40    44.55              20.71    24.52    24.81
                 Pr(Stop)       0.88     0.85     0.82               0.96     0.95     0.94
                 Pr(Select)     0.07     0.11     0.13               0.03     0.05     0.05

  ALL (r = 2)    π_D^TR         0.52     0.58     0.65     0.20      0.77     0.83     0.88     0.40
                 Ū^TR          33.99    29.43    24.52              14.34    11.01     7.57
                 Pr(Stop)       0.93     0.96     0.98               0.99     0.99     1.00
                 Pr(Select)     0.06     0.03     0.01               0.01     0.01     0.00

  NHL (r = 3)    π_D^TR         0.40     0.42     0.45     0.20      0.64     0.67     0.70     0.40
                 Ū^TR          42.49    40.21    38.79              22.61    20.32    18.95
                 Pr(Stop)       0.73     0.77     0.84               0.85     0.87     0.94
                 Pr(Select)     0.19     0.16     0.07               0.12     0.11     0.04

Scenario 4
  CLL (r = 1)    π_D^TR         0.01     0.10     0.25     0.15      0.01     0.11     0.27     0.30
                 Ū^TR          48.99    58.91    45.84              38.03    24.74    14.45
                 Pr(Stop)       0.00     0.00     0.26               0.00     0.00     0.13
                 Pr(Select)     0.04     0.95     0.01               1.00     0.00     0.00

  ALL (r = 2)    π_D^TR         0.01     0.09     0.27     0.20      0.01     0.10     0.29     0.40
                 Ū^TR          48.82    58.66    43.95              37.54    26.47    14.55
                 Pr(Stop)       0.00     0.00     0.14               0.00     0.00     0.05
                 Pr(Select)     0.02     0.98     0.00               1.00     0.00     0.00

  NHL (r = 3)    π_D^TR         0.01     0.08     0.30     0.20      0.01     0.09     0.33     0.40
                 Ū^TR          48.87    58.22    40.45              40.28    26.92    11.70
                 Pr(Stop)       0.00     0.00     0.29               0.00     0.00     0.12
                 Pr(Select)     0.01     0.99     0.00               1.00     0.00     0.00

Scenario 5
  CLL (r = 1)    π_D^TR         0.01     0.09     0.30     0.15      0.01     0.12     0.38     0.30
                 Ū^TR          44.30    33.82    22.05              41.40    27.18    14.33
                 Pr(Stop)       0.00     0.00     0.48               0.00     0.00     0.44
                 Pr(Select)     1.00     0.00     0.00               1.00     0.00     0.00

  ALL (r = 2)    π_D^TR         0.12     0.03     0.18     0.20      0.16     0.04     0.23     0.40
                 Ū^TR          30.68    40.15    27.96              23.62    35.88    20.52
                 Pr(Stop)       0.00     0.00     0.02               0.00     0.00     0.01
                 Pr(Select)     0.00     1.00     0.00               0.00     0.99     0.00

  NHL (r = 3)    π_D^TR         0.10     0.15     0.01     0.20      0.13     0.20     0.01     0.40
                 Ū^TR          32.73    28.73    44.09              26.24    21.60    41.22
                 Pr(Stop)       0.00     0.01     0.00               0.00     0.00     0.00
                 Pr(Select)     0.00     0.00     1.00               0.00     0.00     1.00

Scenario 6
  CLL (r = 1)    π_D^TR         0.35     0.03     0.13     0.15      0.75     0.10     0.37     0.30
                 Ū^TR          41.74    59.80    57.69              14.53    55.48    40.90
                 Pr(Stop)       0.76     0.00     0.09               0.99     0.00     0.34
                 Pr(Select)     0.00     0.56     0.44               0.00     0.99     0.01

  ALL (r = 2)    π_D^TR         0.08     0.45     0.02     0.20      0.24     0.86     0.06     0.40
                 Ū^TR          57.75    34.95    57.83              47.19     7.81    55.07
                 Pr(Stop)       0.00     0.82     0.00               0.02     0.99     0.00
                 Pr(Select)     0.80     0.00     0.20               0.04     0.00     0.96

  NHL (r = 3)    π_D^TR         0.05     0.10     0.30     0.20      0.16     0.29     0.70     0.40
                 Ū^TR          59.50    57.18    46.28              52.57    44.10    18.84
                 Pr(Stop)       0.00     0.01     0.51               0.00     0.03     0.91
                 Pr(Select)     0.53     0.47     0.01               0.89     0.11     0.00

The design is defined in terms of possible actions 𝒜 = {0, 1, 2, 3}, where any d ≥ 1 is a dose and d = 0 is the action to not administer any NK cells. Let 𝒜(Z,𝒟n(t)) be the subset of acceptable actions for a patient with covariates Z at trial time t based on interim data 𝒟n(t). If no doses are safe for Z, i.e. 𝒜(Z,𝒟n(t)) = {0}, then no patient in subgroup Z is treated. The acceptable dose sets 𝒜(Z1,𝒟n(t)) and 𝒜(Z2,𝒟n(t)) may differ for Z1Z2 at time t during the trial, and these sets may change adaptively as data accumulate.

During trial conduct, for each disease type r, patients are randomized among the three doses in order of entry to the trial by randomly permuting the integers (1, 2, 3). Safety monitoring is begun for each disease type r when nine patients have been enrolled in r and at least five of the nine have died or been followed for 100 days. For each disease type r, we define action sets, 𝒜((0, r), t) and 𝒜((1, r), t), for the two disease bulk subgroups. E.g., suppose the initial permuted dose blocks are (3, 1, 2), for r = 1, (3, 2, 1) for r = 2, and (2, 1, 3) for r = 3. Once safety monitoring is begun, unsafe doses are eliminated from each block adaptively. For example, if the design gives doses (3, 1, 2) for a cohort with disease type r, the following two possible cases illustrate details of trial conduct.

  1. Suppose the first patient in the cohort has Z = (0, r). We first update 𝒜((0, r), t). If the updated 𝒜((0, r), t) = {0}, i.e. no dose is safe for this subgroup, we do not give any NK cells to the patient. If 3 ∈ 𝒜(Z, t), the patient is treated at d = 3. If not, we move on to a dose in the permutation that has not been used and is safe for Z.

  2. Suppose d = 3 is given to the previous patient in the cohort, and the next patient in the cohort has Z = (1, r). We update 𝒜((1, r), t) based on the most recent data. If 𝒜((1, r), t) = {0}, we do not give any NK cells to the patient. If 1 ∈ 𝒜((1, r), t), then we give d = 1 to the patient. If not, we proceed to d = 2. If d = 2 also is not safe, then d = 3 is the only safe dose for Z = (1, r), and all doses in (3, 1, 2) have been used. A new cohort is started by randomly permuting (1, 2, 3). Suppose this gives (1, 3, 2). Because d = 1 is not safe it is skipped, and d = 3 is given to the patient, so only d = 2 is left in that block for the next patient with disease type r. This block-based assignment with adaptive elimination is sketched below.
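The following sketch gives one possible implementation of the block-based assignment just described; it is illustrative bookkeeping only, and the acceptable set passed in stands in for the result of rule (6).

```python
import random

def next_dose(block, acceptable):
    """Assign the next dose for one disease type r (illustrative bookkeeping).

    `block` holds the doses remaining in the current permuted block, and
    `acceptable` is the current set A(Z, t) of safe doses for the patient's
    subgroup.  Returns (dose or None, remaining block)."""
    if not acceptable:
        return None, block                    # A(Z, t) = {0}: give no NK cells
    for k, d in enumerate(block):
        if d in acceptable:                   # skip doses unsafe for this Z
            return d, block[k + 1:]
    # Every dose left in the block is unsafe for this Z: start a new block.
    new_block = random.sample([1, 2, 3], 3)
    for k, d in enumerate(new_block):
        if d in acceptable:
            return d, new_block[k + 1:]
    return None, new_block

# Case 2 above: block (3, 1, 2) with d = 3 already given, and doses 1 and 2
# currently unsafe for Z = (1, r), so only d = 3 is acceptable.
random.seed(1)
print(next_dose([1, 2], acceptable={3}))      # starts a new block and assigns 3
```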

An additional rule imposed by a regulatory agency also is included. The regulator required a safety rule that ignores Z and stops the trial if the probability of death within 30 days at d = 1 is too high. To comply with this requirement, we formulated a simplified model for this safety rule only, assuming δ_D(30) | q_D ~ Ber(q_D), where q_D = Pr(Y_D ≤ 30 | d = 1) for all Z, and q_D ~ Beta(0.4, 0.6) a priori. If Pr(q_D > 0.40 | 𝒟_{n(t)}) > 0.90, then this rule stops the trial and concludes that no dose is safe for any patient. Thus, the trial can be stopped either by the regulator’s rule or by the subgroup-specific safety rules.
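Because the regulator's rule uses a conjugate Beta–binomial model, its posterior probability is available in closed form. A minimal sketch with hypothetical interim counts:

```python
from scipy import stats

def regulator_stop(n_deaths_30, n_evaluable, a=0.4, b=0.6, q_limit=0.40, p_cut=0.90):
    """Stop the whole trial if Pr(q_D > q_limit | data) > p_cut under a
    Beta(a, b) prior and a binomial likelihood for 30-day deaths at d = 1."""
    post = stats.beta(a + n_deaths_30, b + n_evaluable - n_deaths_30)
    return post.sf(q_limit) > p_cut

# Hypothetical interim data at d = 1: deaths within 30 days among 9 patients.
print(regulator_stop(6, 9))    # True: the rule would stop the trial
print(regulator_stop(2, 9))    # False: the trial continues
```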

To determine a final optimal action for each Z when Nmax = 60 at Tmax, we identify 𝒜(Z, Tmax) using the safety rule in (6). If 𝒜(Z, Tmax) = {0}, then no dose is selected for Z, denoted by dsel(Z) = 0. If 𝒜(Z, Tmax) ≠ {0}, then the selected optimal dose is

\[
d^{\mathrm{sel}}(Z) \;=\; \operatorname*{arg\,max}_{d \in \mathcal{A}(Z, T_{\max})} \; u(d, Z \mid \mathcal{D}_{N_{\max}}).
\]

5 Simulation Study

5.1 Simulation Design

We simulated the NK cell trial under six scenarios to evaluate the design’s performance. For Scenario 1, we used the prior occurrence probabilities π^(e)_{j,Z,r} elicited from the clinicians in Table 1 to simulate data, with fixed “true” parameter values α_j^TR and λ̄^TR_{j,Z,r,d} determined by solving the equations

\[
\pi^{(e)}_{j,Z,r} \;=\; 1 - S\big(L'_j \mid \alpha_j^{TR}, \bar{\lambda}^{TR}_{j,Z,r,d}\big) \;=\; 1 - \exp\big\{-\exp\big(\bar{\lambda}^{TR}_{j,Z,r,d}\big)\, (L'_j)^{\alpha_j^{TR}}\big\} \tag{7}
\]

for each (j, Z, r). In Scenario 1, we assumed that dose has no effect, with the same λ̄^TR_{j,Z,r,d} for all d and no regression relationship (2) for λ̄^TR. We simulated data from the Weibull distribution,

\[
y_{i,j} \mid \alpha_j^{TR}, \bar{\lambda}^{TR}_{j,Z_i,r_i,d_i}, u^{TR}_{i,j} \;\overset{\text{indep}}{\sim}\; \mbox{Weibull}\big(\alpha_j^{TR},\, \exp\big(\bar{\lambda}^{TR}_{j,Z_i,r_i,d_i} + u^{TR}_{i,j}\big)\big),
\]

where u_i^TR ~iid N_5(0, Ω^TR), with Ω^TR_{j,j} = 0.001, Ω^TR_{j,R} = −0.5 × 0.001 for j ≠ R, and Ω^TR_{j,j′} = 0.5 × 0.001 for j ≠ j′ and j, j′ ≠ R. For Scenarios 2–6, we assumed the same Ω^TR and specified α^TR and the true marginal probability of death by L′_D = 100 for LBD (Z = 0), disease type r, and dose d, π^TR_{D,0,r,d} = Pr(Y_D ≤ L′_D | d, Z = 0, r). The survival function in (7) with α_D^TR and π^TR_{D,0,r,d} gives λ̄^TR_{D,0,r,d}. For subgroups with Z = 1 and for the other outcomes, we specified ξ̄_j^TR and simulated ξ^TR_{j,r,d} ~indep N(ξ̄_j^TR, 0.01²) for all combinations of (j, r, d), with ξ^TR_{D,r,d} = 1. Similarly, we specified β̄_j^TR and simulated β^TR_{j,r,d} ~indep N(β̄_j^TR, 0.01²) for j = P, R, and log(β^TR_{j,r,d}) ~indep N(log(β̄_j^TR), 0.05²) for j = T, C, D, to ensure that β^TR_{j,r,d} > 0 for the adverse outcomes. We set λ̄^TR_{j,Z,r,d} = β^TR_{j,r,d} Z + ξ^TR_{j,r,d} ψ^TR_{D,r,d} and λ^TR_{i,j} = λ̄^TR_{j,Z_i,r_i,d} + u^TR_{i,j}, and generated the event time Y_{i,j} for a patient with Z_i from Weibull(α_j^TR, exp(λ^TR_{i,j})). Under the model assumed for the simulation truth, β^TR and ξ^TR are indexed by j, r, and d. This more complex model includes the design’s assumed Weibull model as a special case, obtained by setting β^TR_{j,r,d} = β_j and ξ^TR_{j,r,d} = ξ_j for all (r, d). The assumed true probabilities of death, π^TR_{D,Z,r,d} for each (Z, r, d), are shown in the first line of each simulation scenario block in Table 3; a dose is truly unsafe when this probability exceeds the subgroup-specific upper limit π̄_D(Z). The second lines give the true expected utilities Ū^TR for each (Z, d), from which the optimal safe dose for each Z can be read off. For example, all doses are safe for all Z under Scenario 1, while all doses are unsafe for all Z under Scenario 3. Under Scenario 1, all doses are equally good, while under Scenarios 2, 4, 5, and 6, the optimal safe doses vary with Z and using patient subgroup information is critical. Under Scenario 4, doses 2 and 1 are optimal for patients with Z = 0 and Z = 1, respectively, regardless of r. Under Scenario 5, the optimal doses vary with disease type r but not with disease bulk Z. Under Scenario 6, the true mean utilities vary with (d, Z, r), and the set of acceptable doses varies with Z. Values of α^TR, β̄^TR, and ξ̄^TR assumed for the scenarios are given in Supplementary Table 2. A total of M = 1000 trials were simulated under each scenario.
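Inverting (7) gives the closed form λ̄ = log{−log(1 − π)} − α log(L′_j). The sketch below uses this to set an illustrative simulation truth and draws event times with a small frailty, checking the implied event probability; the probability and shape values here are ours, not the Supplementary Table 2 values.

```python
import numpy as np

rng = np.random.default_rng(4)

def solve_lambda(pi, alpha, horizon):
    """Invert (7): pi = 1 - exp{-exp(lam) * horizon**alpha}  =>  lam."""
    return np.log(-np.log(1.0 - pi)) - alpha * np.log(horizon)

def simulate_event_time(alpha, lam_bar, omega_var=0.001):
    """Draw one Y_{i,j} from Weibull(alpha, exp(lam_bar + u)) with frailty u."""
    u = rng.normal(0.0, np.sqrt(omega_var))
    return rng.weibull(alpha) * np.exp(-(lam_bar + u) / alpha)

# Illustrative truth: 100-day death probability 0.10 with shape alpha_D = 1.2.
alpha_D = 1.2
lam_D = solve_lambda(0.10, alpha_D, horizon=100.0)
times = np.array([simulate_event_time(alpha_D, lam_D) for _ in range(20000)])
print(lam_D, (times <= 100.0).mean())        # empirical rate close to 0.10
```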

We examined the design’s robustness by simulating the Y_{i,j}’s from a log-logistic distribution. To obtain fair comparisons, in each scenario, given the π^TR_{j,Z,r,d} values under the Weibull distribution, we solved for true parameter values under the log-logistic distribution by matching the π^TR_{j,Z,r,d}’s, so that the marginal probabilities of occurrence during follow-up were identical under the two models and truly unsafe doses remained unsafe regardless of which true model was used to simulate the data. In contrast, the rates of occurrence over time under the two models necessarily differed, which caused the mean utilities to change, due to P and R being competing risks and the semi-competing risks structure between D and the other outcomes on 𝒴_D. The true mean utilities under the log-logistic are given in Supplementary Table 6. Most differences in true mean utilities are minor, although in some cases there are non-trivial differences. For example, under the Weibull distribution d = 1 clearly is optimal for Z = 1 in Scenario 4, but under the log-logistic distribution the differences between the expected utilities of the three doses are much smaller. Details of the simulation design are given in Supplementary §3.
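The matching used in this robustness study can be illustrated as follows: for a chosen log-logistic shape (an illustrative value here), set the scale so that the log-logistic CDF at the follow-up horizon equals the Weibull event probability, preserving the marginal occurrence probability while the hazard over time changes.

```python
import numpy as np

rng = np.random.default_rng(5)

def loglogistic_scale(pi, shape, horizon):
    """Scale s such that the log-logistic CDF F(t) = 1 / (1 + (t/s)**(-shape))
    satisfies F(horizon) = pi, matching the Weibull event probability."""
    return horizon * ((1.0 - pi) / pi) ** (1.0 / shape)

def loglogistic_rvs(shape, scale, size):
    """Inverse-CDF sampling: T = s * (U / (1 - U))**(1/shape)."""
    u = rng.uniform(size=size)
    return scale * (u / (1.0 - u)) ** (1.0 / shape)

# Match a 100-day event probability of 0.10 with an illustrative shape of 1.5.
pi, shape, horizon = 0.10, 1.5, 100.0
scale = loglogistic_scale(pi, shape, horizon)
draws = loglogistic_rvs(shape, scale, 200000)
print(scale, (draws <= horizon).mean())      # empirical probability close to 0.10
```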

As a comparator, we considered a simplified version of the design that does not use covariates or make subgroup-specific inferences. For this design, we assumed a simpler model that ignores Z but still accounts for the five event times and their competing risks and semi-competing risks relationships. This model assumes Y_{i,j} | α′_j, λ′_{i,j} ~indep Weibull(α′_j, exp(λ′_{i,j})), where λ′_{i,j} = ξ′_j ψ′_{d_i} + u′_{i,j}, with ξ′_D = 1 and u′_i ~iid N_5(0, Ω′). Similarly to the full model, under the simpler model we assumed a gamma prior for α′_j, normal priors for ξ′_j and ψ′_d, and an inverse-Wishart prior for Ω′. Under the simpler design, the acceptable action set 𝒜(t) is a function of t only, so if a dose is declared unsafe this pertains to all patients. A similar randomization with blocks of size m = 3 was used to allocate patients to the doses in 𝒜(t), and a dose d was declared unsafe if Pr(π_D(d, θ′) > 0.25 | 𝒟_{n(t)}) > 0.80, where π_D(d, θ′) is the probability of death within 100 days under dose d and θ′ = (α′, ψ′, ξ′). Fixed prior hyperparameters under the simpler model were specified using the elicited probabilities in Table 1, but ignoring any effects of Z. Posterior mean utility for each d, ignoring Z, was used as the criterion to choose an optimal dose for any future patient.

We evaluated the designs using two criteria: the probability of identifying doses with truly excessive death probabilities, and the probability of selecting the truly optimal safe dose for each Z. For each simulated trial ℓ = 1, …, M and each Z, each design selects a dose d_{sel,ℓ}(Z), with d_{sel,ℓ}(Z) ≡ d_{sel,ℓ} for all Z under the simpler design. We let κ_{1,ℓ}(d, Z) = 1 if dose d is identified as unsafe for a patient with Z in simulated trial ℓ, and 0 if not, and κ_{2,ℓ} = 1 if trial ℓ is not terminated by the regulator’s safety rule, and 0 otherwise. For I(·) the indicator function, we summarized the simulation results using the empirical proportions among trials not stopped by the regulator’s safety rule, given for each d and Z by

\[
\Pr(\text{Stop} \mid d, Z) = \frac{\sum_{\ell=1}^{M} \kappa_{2,\ell}\, \kappa_{1,\ell}(d, Z)}{\sum_{\ell=1}^{M} \kappa_{2,\ell}} \qquad \text{and} \qquad \Pr(\text{Select} \mid d, Z) = \frac{\sum_{\ell=1}^{M} \kappa_{2,\ell}\, I\big(d = d_{\text{sel},\ell}(Z)\big)}{\sum_{\ell=1}^{M} \kappa_{2,\ell}}.
\]

5.2 Simulation Results

Simulation results are summarized in Table 3, which includes the simulation truth to facilitate evaluation, with Pr(Stop | d, Z) and Pr(Select | d, Z) shown in the third and fourth lines for each Z. Overall, based on Nmax = 60, the design reliably identifies unsafe doses and selects optimal doses for each subgroup. Large Pr(Stop | d, Z) is achieved for (d, Z) with large π_D^TR. When π_D^TR is clearly greater than π̄_D(Z), Pr(Stop | d, Z) is particularly high, as in the cases with r = 1 in Scenario 3. Cases where π_D^TR is only slightly greater than π̄_D(Z) tend not to achieve high Pr(Stop | d, Z), in part due to the small sub-sample size per subgroup. The design makes more accurate decisions for Z = 1 than for Z = 0, due to the prevalences Pr(Z = 1) = 2/3 and Pr(Z = 0) = 1/3. For example, in Scenario 3, π_D^TR exceeds π̄_D by approximately the same amount for Z = (0, 1) and Z = (1, 1), but Pr(Stop) is much larger for Z = (1, 1) because there are more data on HBD patients (Z = 1). Truly optimal safe doses have large Pr(Select | d, Z), shown on the fourth line for each Z in the table. In Scenario 1, all doses have the same true expected utility for each Z, and doses are selected with nearly equal probabilities for all (Z, r). When there are clearly optimal doses, as in Scenario 2, the design has large Pr(Select | d, Z) for those doses. When two doses have similar expected utilities, as for Z = 0 in Scenario 6, the design selects both doses with large Pr(Select | d, Z). When no dose is safe, as in Scenario 3, Pr(Select | d, Z) is small for all d. Scenario 6 is complex in that the pattern of the true utilities varies with both Z and r; the design captures this pattern well and makes correct decisions with high probabilities.

The proportions of patients treated at each dose in each subgroup are summarized in Supplementary Table 5, showing that the design reliably identifies unsafe doses for each subgroup and assigns fewer patients to doses declared unsafe. If all doses are identified as safe for a subgroup, patients in the subgroup are assigned to a dose at random, as in Scenario 1. Proportions of trials terminated by the regulator’s safety rule are summarized in Supplementary Table 7. The regulator’s safety rule rarely terminates the trial, even in Scenario 3 where π_D^TR exceeds π̄_D(Z) for all Z. This is because, under Scenario 3, a trial is terminated earlier by the subgroup-specific safety rule, since all doses are unsafe for all subgroups. When d = 1 is unsafe, it is likely that the subgroup-specific safety rule identifies this and stops further allocation of patients to d = 1, so no more deaths occur at d = 1. This helps prevent the regulator’s safety rule from incorrectly terminating the entire trial when only d = 1 is unsafe, thus continuing accrual at safe doses and improving evaluation of outcomes for those doses.

Stopping and dose selection probabilities under the simpler design that ignores Z are compared to those under the design with subgroup-specific decisions in Figure 1 and summarized in Supplementary Tables 3 and 4. Panels (a) and (b) of the figure give histograms of the differences, Pr(Stop | d,Z) − Pr(Stop | d), for truly safe doses and unsafe doses, respectively, for all (d,Z) and all scenarios. Panel (a) shows that the design accounting for Z often has much smaller Pr(Stop | d,Z) for truly safe doses (thus negative differences). This advantage is substantial in cases like Scenario 2, where the true safety of a dose varies greatly across subgroups, and most doses are unsafe. The histogram in (b) shows that the simpler design often stops truly unsafe doses with higher probability (thus negative differences). This is mainly due to Scenario 3, where all doses are unsafe for all subgroups. However, when a dose is unsafe only for some subgroups, as in Scenarios 4 and 5, the design accounting for Z greatly increases the probability of correctly stopping truly unsafe doses, shown by the large cluster above 0.40. Panel (c) gives the histogram of differences in empirical proportions Pr(Select | d,Z) − Pr(Select | d) for truly optimal doses for all (d,Z) across Scenarios 2–6. The design accounting for Z is much more likely to select truly optimal doses (thus many more positive differences). Doses assigned to patients under the simpler design are summarized in Supplementary Table 5. The table shows that when unsafe doses vary between subgroups, as in Scenario 2, more patients are treated at unsafe doses under the simpler design. A detailed discussion is given in Supplementary §4.

Figure 1.


Comparison to the simpler design ignoring subgroups (Z): Panels (a) and (b) give histograms of differences in empirical proportions Pr(Stop | d,Z) − Pr(Stop | d) for truly safe doses and unsafe doses, respectively, for all (d,Z) and all scenarios. Panel (c) gives the histogram of differences in empirical proportions of correctly selecting a truly optimal dose, Pr(Select | d,Z) − Pr(Select | d) for all Z and all scenarios.

Results of the robustness study are summarized in Supplementary Table 6. When data are simulated from the log-logistic distribution, some correct stopping and selection probabilities are decreased under Scenarios 3 and 4 in the LBD subgroups, since Pr(LBD) = 1/3. In the other cases, differences in the patterns of the hazard functions over time affect the design’s performance only slightly. Thus, the design appears to be robust.

To examine how much the design’s performance is improved by a larger sample size, we re-ran the simulations using Nmax = 120. Supplementary Tables 5 and 8 give the results for Nmax = 120 under all scenarios, showing that subgroup-specific dose assignments are improved (Supplementary Table 5) and that the probabilities of correctly stopping unsafe doses and selecting optimal doses are greatly increased for many combinations of (d,Z) (Supplementary Table 8). For example, in Scenario 6, for patients with NHL, Pr(Stop | d,Z) values increase from 0.51 to 0.67 (LBD) and from 0.91 to 0.99 (HBD) for d = 3. For subgroups with HBD, the truly optimal doses are selected at higher rates for Nmax = 120. It thus appears that a larger Nmax is highly desirable for designs making subgroup-specific decisions.

We also investigated how the performance of the proposed design changes with shorter follow up, by reducing follow up from L = (365, 365, 100, 100, 365) to L′ = (100, 100, 100, 30, 100). The results are summarized in Supplementary Table 9. The design’s performance greatly deteriorates with shorter follow up, on average, for both Pr(Stop | d,Z) and Pr(Select | d,Z). For example, the probability that dose 1 is correctly identified as unsafe decreases from 0.76 to 0.53 for (CLL, LBD) in Scenario 6. Thus, incorporating data from patients monitored for a longer period greatly enhances the design’s performance.

6 Discussion

We have presented a design that does subgroup-specific safety monitoring and dose selection for a clinical trial of NK cells as therapy for severe hematologic diseases. Decisions are based on five time-to-event outcomes through a utility-based dose optimization criterion. Our simulations show that the design performs well under a wide variety of dose-subgroup-outcome scenarios, and that accounting for patient heterogeneity in this setting is very important, since failure to do so is likely to produce a design with extremely large incorrect-decision probabilities in many subgroups. The results in Supplementary Table 8 strongly suggest that trials making subgroup-specific decisions should have larger sample sizes than conventional trials. A general conclusion is that phase I–II designs should do precision medicine and should have larger sample sizes than are used conventionally.

To apply this methodology if some outcome hazards are known to increase with dose, a monotonicity assumption would be needed, and the block randomization would be replaced by a sequentially adaptive within-subgroup dose assignment procedure. The trial would be more difficult to conduct, since the resulting imbalance in dose-subgroup sample sizes would reduce reliability. If the prevalence for one or more subgroups is very low, the proposed design may not be feasible due to unacceptably small sample sizes for those subgroups. Thus, it may be more appropriate to exclude rare subgroups.

Our use of five outcomes reflects the large variability in the elicited utilities of the 12 possible nonfatal elementary events in Table 2. If R were considered to be efficacy and toxicity were defined as occurrence of any of the four adverse events, this would combine elementary events having utilities ranging from 0 to 70. Thus, our more refined utility structure appears warranted.


Acknowledgments

Peter Thall’s research was supported by NIH/NCI grants 5R01 CA083932 and 1R01 CA211044. Katy Rezvani’s research was supported by NIH/NCI grant 1R01 CA211044. Juhee Lee’s research was supported by NSF grant DMS-1662427.

Footnotes

Supplementary Materials

Supplementary materials are available under the Paper Information link at the journal website.

Contributor Information

Juhee Lee, Department of Applied Mathematics and Statistics, University of California at Santa Cruz, Santa Cruz, CA.

Peter F. Thall, Department of Biostatistics, M.D. Anderson Cancer Center, Houston, TX

Katy Rezvani, Department of Stem Cell Transplantation and Cellular Therapy, M.D. Anderson Cancer Center, Houston, TX.

References

  1. Chen G, Zeng D, Kosorok MR. Personalized dose finding using outcome weighted learning. Journal of the American Statistical Association. 2016;111(516):1509–1521. doi: 10.1080/01621459.2016.1148611.
  2. Cheung YK. Dose Finding by the Continual Reassessment Method. CRC Press; 2011.
  3. Cheung YK, Chappell R. Sequential designs for phase I clinical trials with late-onset toxicities. Biometrics. 2000;56(4):1177–1182. doi: 10.1111/j.0006-341x.2000.01177.x.
  4. Fine JP, Jiang H, Chappell R. On semi-competing risks data. Biometrika. 2001:907–919.
  5. Gorfine M, Hsu L. Frailty-based competing risks model for multivariate survival data. Biometrics. 2011;67(2):415–426. doi: 10.1111/j.1541-0420.2010.01470.x.
  6. Guo B, Yuan Y. Bayesian phase I/II biomarker-based dose finding for precision medicine with molecularly targeted agents. Journal of the American Statistical Association. 2017;112(518):508–520. doi: 10.1080/01621459.2016.1228534.
  7. Houede N, Thall PF, Nguyen H, Paoletti X, Kramar A. Utility-based optimization of combination therapy using ordinal toxicity and efficacy in phase I/II trials. Biometrics. 2010;66(2):532–540. doi: 10.1111/j.1541-0420.2009.01302.x.
  8. Jin IH, Liu S, Thall PF, Yuan Y. Using data augmentation to facilitate conduct of phase I–II clinical trials with delayed outcomes. Journal of the American Statistical Association. 2014;109(506):525–536. doi: 10.1080/01621459.2014.881740.
  9. Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. Vol. 360. John Wiley & Sons; 2011.
  10. Lee KH, Dominici F, Schrag D, Haneuse S. Hierarchical models for semicompeting risks data with application to quality of end-of-life care for pancreatic cancer. Journal of the American Statistical Association. 2016;111(515):1075–1095. doi: 10.1080/01621459.2016.1164052.
  11. Murray TA, Thall PF, Yuan Y. Utility-based designs for randomized comparative trials with categorical outcomes. Statistics in Medicine. 2016;35(24):4285–4305. doi: 10.1002/sim.6989.
  12. Prentice RL, Kalbfleisch JD, Peterson AV Jr, Flournoy N, Farewell V, Breslow N. The analysis of failure times in the presence of competing risks. Biometrics. 1978:541–554.
  13. Rezvani K, Rouce RH. The application of natural killer cell immunotherapy for the treatment of cancer. Frontiers in Immunology. 2015;6:578. doi: 10.3389/fimmu.2015.00578.
  14. Ripatti S, Palmgren J. Estimation of multivariate frailty models using penalized partial likelihood. Biometrics. 2000;56(4):1016–1022. doi: 10.1111/j.0006-341x.2000.01016.x.
  15. Thall PF, Nguyen HQ, Braun TM, Qazilbash MH. Using joint utilities of the times to response and toxicity to adaptively optimize schedule–dose regimes. Biometrics. 2013;69(3):673–682. doi: 10.1111/biom.12065.
  16. Thall PF, Nguyen HQ, Estey EH. Patient-specific dose-finding based on bivariate outcomes and covariates. Biometrics. 2008;64(4):1126–1136. doi: 10.1111/j.1541-0420.2008.01009.x.
  17. Yuan Y, Nguyen HQ, Thall PF. Bayesian Designs for Phase I–II Clinical Trials. Chapman & Hall/CRC; New York: 2016.
  18. Zhou Y, Whitehead J, Bonvini E, Stevens JW. Bayesian decision procedures for binary and continuous bivariate dose-escalation studies. Pharmaceutical Statistics. 2006;5(2):125–133. doi: 10.1002/pst.222.
  19. Zohar S, Chevret S. Recent developments in adaptive designs for phase I/II dose-finding studies. Journal of Biopharmaceutical Statistics. 2007;17(6):1071–1083. doi: 10.1080/10543400701645116.
