Annals of Oncology. 2019 Feb 25;30(4):542–550. doi: 10.1093/annonc/mdz053

Application of a sequential multiple assignment randomized trial (SMART) design in older patients with chronic lymphocytic leukemia

A S Ruppert 1,2, J Yin 3, M Davidian 4, A A Tsiatis 4, J C Byrd 1, J A Woyach 1, S J Mandrekar 3
PMCID: PMC6735877  PMID: 30799502

Abstract

Background

Ibrutinib therapy is safe and effective in patients with chronic lymphocytic leukemia (CLL). Currently, ibrutinib is administered continuously until disease progression. Combination regimens with ibrutinib are being developed to deepen response which could allow for ibrutinib maintenance (IM) discontinuation. Among untreated older patients with CLL, clinical investigators had the following questions: (i) does ibrutinib + venetoclax + obinutuzumab (IVO) with IM have superior progression-free survival (PFS) compared with ibrutinib + obinutuzumab (IO) with IM, and (ii) does the treatment strategy of IVO + IM for patients without minimal residual disease complete response (MRD- CR) or IVO + IM discontinuation for patients with MRD- CR have superior PFS compared with IO + IM.

Design

Conventional designs randomize patients to IO with IM or IVO with IM to address the first objective, or randomize patients to each treatment strategy to address the second objective. A sequential multiple assignment randomized trial (SMART) design and analysis is proposed to address both objectives.

Results

A SMART design strategy is appropriate when comparing adaptive interventions, which are defined by an individual’s sequence of treatment decisions and guided by intermediate outcomes, such as response to therapy. A review of common applications of SMART design strategies is provided. Specific to the SMART design previously considered for Alliance study A041702, the general structure of the SMART is presented, an approach to sample size and power calculations when comparing adaptive interventions embedded in the SMART with a time-to-event end point is fully described, and analysis plans are outlined.

Conclusion

SMART design strategies can be used in cancer clinical trials with adaptive interventions to identify optimal treatment strategies. Further, standard software exists to provide sample size, power calculations, and data analysis for a SMART design.

Keywords: SMART, CLL, design strategies, adaptive interventions, randomized clinical trials


Key Message

Sequential multiple assignment randomized trial (SMART) designs are applicable to clinical trials in which treatment decisions are guided by intermediate patient outcomes. A randomized phase III study in chronic lymphocytic leukemia is used to illustrate a SMART design, sample size and power calculations for a time-to-event end point, and analysis plans to identify an optimal treatment strategy.

Introduction

Chronic lymphocytic leukemia (CLL) is the most prevalent adult leukemia, with a median age at diagnosis of 72 years [1]. It is incurable outside of allogeneic stem cell transplant, which often is not an option for older patients [2].

Ibrutinib is a Bruton’s tyrosine kinase inhibitor approved by the US Food and Drug Administration as a first-line treatment of patients with CLL. In a randomized phase III study comparing ibrutinib to chlorambucil in older patients (RESONATE-2: ClinicalTrials.gov NCT01722487), ibrutinib significantly extended progression-free survival (PFS) and overall survival (OS) (P <0.01 for both end points) [3]. Favorable long-term outcome of ibrutinib in previously untreated CLL patients was also observed in a pilot phase II study [4].

Since response to therapy improves with longer duration of ibrutinib administration and the majority of patients achieve partial responses, ibrutinib is currently given continuously until disease progression [3, 5]. Ibrutinib combined with CD20 monoclonal antibodies such as rituximab and obinutuzumab is safe and is expected to be at least as effective as single-agent ibrutinib [6, 7]. In a randomized phase III study comparing ibrutinib + rituximab versus ibrutinib alone versus bendamustine + rituximab in untreated older patients (Alliance for Clinical Trials in Oncology A041202: Clinicaltrials.gov NCT01886872), both ibrutinib-based regimens significantly extended PFS (P <0.01), but adding rituximab to ibrutinib did not significantly extend PFS compared with ibrutinib alone [8]. Whether the results would differ if a different CD20 antibody, such as obinutuzumab, were added to ibrutinib is undetermined.

Combining ibrutinib with additional targeted therapies will likely be required to achieve minimal residual disease negativity with complete remission (MRD- CR) in a substantial number of patients to allow discontinuation of therapy [8]. Preclinical data suggest synergy between the B-cell lymphoma 2 (Bcl-2) inhibitor venetoclax and ibrutinib [9]. Venetoclax as a single agent induces complete remission in 20% of patients with relapsed CLL [10], and the combination of ibrutinib + venetoclax + obinutuzumab (IVO) is currently under investigation in the phase I/II setting in patients with relapsed/refractory CLL and in patients with treatment-naive CLL (Clinicaltrials.gov NCT02427451).

As part of Alliance A041702, a randomized phase III study in untreated older patients with CLL, clinical investigators had two treatment questions: (i) does limited front-line treatment with IVO and ibrutinib maintenance (IM) have superior PFS compared with limited front-line treatment with ibrutinib + obinutuzumab (IO) and IM and (ii) does the treatment strategy of IVO + IM for patients without MRD- CR and IVO + IM discontinuation for patients with MRD- CR have superior long-term PFS compared with IO + IM for all patients. To address both questions of interest in a single clinical trial design, a sequential multiple assignment randomized trial (SMART) design was considered.

A SMART design is a randomized clinical trial design for building adaptive interventions [11]. Adaptive interventions in which an individual’s sequence of treatments depends on observed outcomes are also known as treatment policies, multi-stage treatment strategies, and dynamic treatment regimes. In a trial that uses a SMART design, each patient is randomly allocated to an initial treatment and subsequent treatments based on response, patient characteristics, or behaviors observed during the previous treatment. The intensity of the treatment can stay the same, increase, or decrease at each treatment decision time point depending on the goals of the individual study. Each treatment decision time point defines the beginning or ending of a treatment stage. The treatment options at each stage and the criteria on which randomizations are based determine a set of adaptive interventions said to be embedded in the SMART.

Trials with SMART designs are suitable for discovering which sequential treatments work better than stand-alone treatments and can be used to investigate the interplay between treatment strategies and disease development. The repeated randomizations in a SMART design ensure that patients assigned to each of the embedded treatment regimens are balanced in terms of both observed and unobserved patient characteristics. Data from a SMART design can be used to simultaneously address the effectiveness of treatments at each stage of the trial as well as the effectiveness of the overall embedded treatment regimens. Consistent with the aim of the SMART design, the data are analyzed across stages and not separately by stage when determining the optimal treatment strategy.

SMART design strategies have often been used in social and behavioral sciences studies [12–15]. Precursors to SMART design strategies could be seen in cancer clinical trials for the treatment of acute myeloid leukemia (AML) [16], small-cell lung carcinoma [17], neuroblastoma [18, 19], and prostate cancer [20] but were not designed to analyze embedded treatment regimens. In each of these trials, patients were randomized to an initial therapy and, depending on response, re-randomized to a subsequent therapy. Results from the AML and small-cell lung carcinoma trials were reported separately for each stage, where all patients were included in the analysis of response before re-randomization, but only small subgroups of patients achieving remission were included in the analysis of patient outcomes post re-randomization. Results from the neuroblastoma trial were reported on subgroups of patients following a particular treatment sequence and showed survival differences between treatment sequences. The data have since been re-analyzed using all patients with statistical methods for comparing embedded treatment regimens and showed no survival differences among any of the embedded treatment regimens [18, 19, 21]. Results for the prostate cancer trial were initially reported using stage-specific analyses for 12 different treatment sequences (four first-stage treatments, and three second-stage treatments) and were reported across stages by defining overall treatment success as two consecutive favorable courses of therapy on patients with complete information. Data from this trial were re-analyzed, using all patients and with statistical methods for comparing embedded treatment regimens, with fairly consistent results [20, 22].

A recent SMART has been prospectively designed for patients with metastatic malignant melanoma [23]. In the first stage, patients will be randomized to one of two neurobehavioral therapies. If the response to initial therapy is favorable, patients will continue on the same therapy. If the response to initial therapy is unfavorable according to a depression score, patients will either augment therapy or switch to the other drug. The analysis will compare embedded treatment regimens defined by the four different treatment paths an individual patient could take, with adherence rate to 12 weeks of therapy as the primary end point.

In general, there have been few cancer trials prospectively designed as a SMART in which the primary aim is to compare embedded treatment regimens [21]. There is a need to share information regarding the design, implementation, and analysis of cancer trials using SMART designs, particularly with time-to-event end points, which have not been well-represented in prospective SMART design analysis strategies. For this reason, we describe a SMART design approach considered for Alliance A041702, which compares PFS between embedded treatment regimens in the disease setting of CLL.

Design

According to the schema in Figure 1, previously untreated patients with CLL, aged 70 years or older and requiring therapy, would be randomized with equal allocation to the first-stage intervention with IO or IVO. Following the first-stage intervention, all patients would be evaluated for MRD and response, with MRD- disease defined as <1 CLL cell per 10 000 leukocytes in the bone marrow and CR according to the Revised IWCLL 2008 response criteria [24]. Regardless of MRD- CR status, patients randomized to IO would receive IM, the current standard of care, as the second-stage intervention. However, for patients randomized to IVO, the second-stage intervention would depend on the intermediate MRD- CR status. Patients without MRD- CR following IVO would receive IM, and patients with MRD- CR following IVO would be re-randomized 1:1 to either IM or IM discontinuation. Re-randomization of patients with MRD- CR status would only be implemented for those receiving IVO, but not for those receiving IO, since the anticipated rate of MRD- CR in the IO arm is low and discontinuation of IM in these patients could potentially raise ethical concerns. All patients would be followed for PFS as the primary end point, defined from first randomization date until the earlier of disease progression or death from any cause, censoring patients alive and progression-free at the last known clinical assessment date. Patients would have disease assessments on the first day of each 28-day cycle during the first-stage intervention. Thereafter, patients would be followed every three cycles for 6 years from registration and then every six cycles until progression, death, or 10 years from registration.
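The assignment rules described above can be sketched in a few lines of Python. This is only an illustration of the schema's logic: the function name `second_stage` and the string labels are hypothetical, and the stratification factors used in the actual randomizations are ignored here.

```python
import random


def second_stage(first_stage, mrd_cr, rng):
    """Return the second-stage intervention for one patient.

    first_stage: "IO" or "IVO" (first-stage randomized intervention).
    mrd_cr: True if the patient achieved MRD- CR after the first stage.
    rng: a random.Random instance used for the 1:1 re-randomization.
    """
    if first_stage == "IO":
        # IO patients receive ibrutinib maintenance regardless of MRD- CR.
        return "IM"
    if first_stage == "IVO":
        if mrd_cr:
            # Only IVO patients with MRD- CR are re-randomized 1:1.
            return rng.choice(["IM", "IM discontinuation"])
        return "IM"
    raise ValueError(f"unknown first-stage intervention: {first_stage!r}")


rng = random.Random(2019)  # arbitrary seed, for reproducibility only
first = rng.choice(["IO", "IVO"])  # equal allocation at the first stage
print(first, second_stage(first, mrd_cr=True, rng=rng))
```

Note that the re-randomization is restricted: it depends on both the first-stage arm and the intermediate outcome, which is why the later analysis needs weights.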

Figure 1.

SMART Schema. Both randomizations (R) use the initial stratification factors of baseline Rai stage (intermediate versus high) and FISH abnormality del(17)(p13.1) at baseline (present versus absent). Ibrutinib maintenance (IM) in stage 2 continues until disease progression, death, or unacceptable adverse events. Patients randomized to IM discontinuation would not receive any ibrutinib therapy following re-randomization. Embedded treatment regimen (1, 1) includes patients randomized to ibrutinib plus obinutuzumab followed by IM, regardless of minimal residual disease negative with complete response (MRD- CR) status; embedded treatment regimen (2, 1) includes patients randomized to ibrutinib plus venetoclax plus obinutuzumab followed by IM, regardless of MRD- CR status; embedded treatment regimen (2, 2) includes patients randomized to ibrutinib plus venetoclax plus obinutuzumab followed by IM for those who do not achieve MRD- CR status and IM discontinuation for those who do achieve MRD- CR status.

A SMART design was proposed with use of a weighted log-rank statistic to compare PFS between embedded treatment regimens. Based on the schema (Figure 1), there are three embedded treatment regimens: (1, 1) includes patients randomized to IO, with IM for patients without MRD- CR (subgroup 1A) and IM for patients with MRD- CR (subgroup 1B); (2, 1) includes patients randomized to IVO, with IM for patients without MRD- CR (subgroup 2A) and IM for patients with MRD- CR (subgroup 2B); (2, 2) includes patients randomized to IVO, with IM for patients without MRD- CR (subgroup 2A) and IM discontinuation for patients with MRD- CR (subgroup 2C). Using the SMART design, we could compare long-term PFS between embedded treatment regimens (2, 2) and (1, 1) to determine whether more aggressive initial therapy and discontinued therapy for patients with the deepest responses is superior to less aggressive initial therapy with continued therapy for all patients. Use of the SMART design could also allow for comparison of short-term PFS between embedded treatment regimens (2, 1) and (1, 1) to determine whether IVO with IM is superior to IO with IM.
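To make the relationship between subgroups and embedded treatment regimens concrete, the mapping above can be written as a small lookup. The names `REGIMENS` and `consistent_regimens` are hypothetical and exist only for this sketch.

```python
# Subgroups (first-stage arm, MRD- CR status, second-stage intervention)
# that are consistent with each embedded treatment regimen.
REGIMENS = {
    "(1, 1)": {"1A", "1B"},  # IO, then IM regardless of MRD- CR
    "(2, 1)": {"2A", "2B"},  # IVO, then IM regardless of MRD- CR
    "(2, 2)": {"2A", "2C"},  # IVO, then IM without MRD- CR, stop IM with MRD- CR
}


def consistent_regimens(subgroup):
    """List the embedded regimens a patient in `subgroup` is consistent with."""
    return sorted(name for name, members in REGIMENS.items() if subgroup in members)


for subgroup in ("1A", "1B", "2A", "2B", "2C"):
    print(subgroup, consistent_regimens(subgroup))
```

Patients in subgroup 2A appear under both (2, 1) and (2, 2), which is the sense in which a SMART shares information across embedded regimens.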

Results

Sample size and power calculations

Sample size and power calculations were carried out using simulated data specific to SMART trials with a time-to-event end point. The simLDTdata function of the DTR package in R was modified to generate data mirroring the assumptions and study design proposed for A041702 [25, 26]. Within each first-stage intervention, survival times for patients without MRD- CR and survival times for patients with MRD- CR following MRD/response evaluation were assumed to be exponential with hazard rates shown in Table 1. Supported by emerging data in the literature [8, 27, 28], it was assumed that 10% and 50% of patients randomized to IO and IVO, respectively, would achieve MRD- CR status after 1 year of therapy. Accrual over 3 years was assumed with a minimum follow-up of 5 years for the primary objective comparing embedded treatment regimens (1, 1) versus (2, 2). Censoring times were assumed to be uniformly distributed between years 5 and 8 of the study. Kaplan–Meier curves derived from data generated under these assumptions are shown in Figure 2 by each first-stage intervention [29].

Table 1.

Assumed hazard rates with simulation-derived 18-month, 3-year, and 5-year progression-free survival (PFS) estimates for each subgroup included in the SMART design

End point | Subgroup 1A | Subgroup 1B | Subgroup 2A | Subgroup 2B | Subgroup 2C
Treatment path | IO with IM for pts without MRD- CR | IO with IM for pts with MRD- CR | IVO with IM for pts without MRD- CR | IVO with IM for pts with MRD- CR | IVO with IM discontinuation for pts with MRD- CR
Hazard rate | 0.075 | 0.050 | 0.0375 | 0.025 | 0.050
18-month PFS (%) | 90 | 97 | 94 | 99 | 98
3-year PFS (%) | 80 | 92 | 89 | 95 | 91
5-year PFS (%) | 69 | 84 | 83 | 91 | 82

Subgroups 1A and 1B are consistent with embedded treatment regimen (1, 1); subgroups 2A and 2B are consistent with embedded treatment regimen (2, 1); subgroups 2A and 2C are consistent with embedded treatment regimen (2, 2). The simulation dataset contained 10 000 observations with 5000 observations allocated to each first-stage intervention. Embedded treatment regimen (1, 1) includes patients randomized to IO as the first-stage intervention, with IM as the second-stage intervention for patients without MRD- CR (subgroup 1A) and IM as the second-stage intervention for patients with MRD- CR (subgroup 1B); (2, 1) includes patients randomized to IVO as the first-stage intervention, with IM as the second-stage intervention for patients without MRD- CR (subgroup 2A) and IM as the second-stage intervention for patients with MRD- CR (subgroup 2B); (2, 2) includes patients randomized to IVO as the first-stage intervention, with IM as the second-stage intervention for patients without MRD- CR (subgroup 2A) and IM discontinuation as the second-stage intervention for patients with MRD- CR (subgroup 2C).

IO, ibrutinib plus obinutuzumab; IM, ibrutinib maintenance; pts, patients; MRD- CR, minimal residual disease negative with complete remission; IVO, ibrutinib plus venetoclax plus obinutuzumab; PFS, progression-free survival.
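As a rough cross-check of Table 1, the closed-form exponential survival function S(t) = exp(-λt) can be evaluated for the two subgroups without MRD- CR. This sketch assumes that the tabulated hazard rates are per year and apply from the date of first randomization; neither assumption is stated explicitly in the table, and the helper name `exponential_pfs` is hypothetical. Subgroups with MRD- CR are left out because their hazards apply only after the MRD/response evaluation.

```python
import math

# Hazard rates from Table 1 (assumed here to be per year).
HAZARD = {"1A": 0.075, "2A": 0.0375}


def exponential_pfs(hazard_per_year, years):
    """Progression-free survival probability under a constant hazard."""
    return math.exp(-hazard_per_year * years)


for subgroup, hazard in HAZARD.items():
    estimates = [100 * exponential_pfs(hazard, t) for t in (1.5, 3.0, 5.0)]
    print(subgroup, [f"{value:.1f}%" for value in estimates])
```

Under these assumptions the closed-form values fall within about one percentage point of the simulation-derived estimates in Table 1; small differences are expected because the tabulated values come from simulated data.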

Figure 2.

Kaplan–Meier curves derived from data generated under study-specific assumptions. (A) Kaplan–Meier curves with the first-stage intervention ibrutinib plus obinutuzumab (IO) and the second-stage intervention ibrutinib maintenance (IM), for all simulated patients and for subgroups of simulated patients who did and did not achieve minimal residual disease negative with complete response (MRD- CR) status. (B) Kaplan–Meier curves with first-stage intervention ibrutinib plus venetoclax plus obinutuzumab (IVO) and second-stage intervention IM or IM discontinuation, for all simulated patients and for subgroups of simulated patients who did and did not achieve MRD- CR status.

Assumptions on hazard rates for each subgroup defined by the combination of the first-stage intervention, MRD- CR status, and second-stage intervention were carefully considered (Table 1). First, we assumed hazard rates for subgroups 1A and 1B would correspond with the 18-month PFS estimate of 90% reported for a similar patient population receiving ibrutinib alone [3], under the presumption that PFS would be no worse with IO. We next assumed that the hazard rate following 1 year of IO with IM for patients with MRD- CR (subgroup 1B) would be 0.67 times the hazard rate of those who followed the same treatment path but did not have MRD- CR (subgroup 1A). It was hypothesized that the risk of an event would be substantially lower for those receiving IVO versus IO with continued IM, and thus the hazard rates for subgroups 2A and 2B were 0.5 times the hazard rates of subgroups 1A and 1B, respectively. Lastly, we assumed equal hazard rates for subgroups 1B and 2C under the hypothesis that risk would be the same for patients who received IVO with IM discontinuation as for patients who received IO with continued IM. A rough approximation of the resulting hazard rate for each embedded treatment regimen was obtained using a simple weighted average across subgroups [i.e. 0.90 × 0.075 + 0.10 × 0.050 = 0.0725, 0.50 × 0.0375 + 0.50 × 0.025 = 0.03125, and 0.50 × 0.0375 + 0.50 × 0.050 = 0.04375 for embedded treatment regimens (1, 1), (2, 1), and (2, 2), respectively]. Kaplan–Meier curves for simulated patients consistent with each embedded treatment regimen under the study assumptions are shown in Figure 3.
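The weighted averages above, and the approximate hazard ratios they imply, can be reproduced with a short calculation. The variable and function names are hypothetical; the inputs are the Table 1 hazard rates and the assumed MRD- CR rates of 10% (IO) and 50% (IVO).

```python
HAZARD = {"1A": 0.075, "1B": 0.050, "2A": 0.0375, "2B": 0.025, "2C": 0.050}
MRD_CR_RATE = {"IO": 0.10, "IVO": 0.50}


def regimen_hazard(without_cr, with_cr, mrd_cr_rate):
    """Weighted average of subgroup hazards for one embedded regimen."""
    return (1 - mrd_cr_rate) * HAZARD[without_cr] + mrd_cr_rate * HAZARD[with_cr]


h11 = regimen_hazard("1A", "1B", MRD_CR_RATE["IO"])   # regimen (1, 1)
h21 = regimen_hazard("2A", "2B", MRD_CR_RATE["IVO"])  # regimen (2, 1)
h22 = regimen_hazard("2A", "2C", MRD_CR_RATE["IVO"])  # regimen (2, 2)

print(f"(2, 2) vs (1, 1): HR ~ {h22 / h11:.2f}")
print(f"(2, 1) vs (1, 1): HR ~ {h21 / h11:.2f}")
```

These ratios are only rough approximations, since a mixture of exponential subgroups does not itself have a constant hazard.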

Figure 3.

Kaplan–Meier curves for each embedded treatment regimen, derived from data generated under study-specific assumptions. Embedded treatment regimen (1, 1) includes simulated patients in the ibrutinib plus obinutuzumab (IO) arm with ibrutinib maintenance (IM); embedded treatment regimen (2, 1) includes simulated patients in the ibrutinib plus venetoclax plus obinutuzumab (IVO) arm with IM, regardless of minimal residual disease negative with complete response (MRD- CR) status; embedded treatment regimen (2, 2) includes simulated patients in the IVO arm with IM for those who did not achieve MRD- CR status and IM discontinuation for those who did achieve MRD- CR status.

Sample size was calculated using simulated datasets so that hazard ratios approximately equal to 0.60 ((2, 2) versus (1, 1)) and 0.43 ((2, 1) versus (1, 1)) could be detected with at least 90% power while constraining the type I error to 0.025 for each comparison. PFS distributions of embedded treatment regimens were compared for each simulated dataset using a weighted robust score test as described in Wolbers and Helterbrand [30]. Weights were assigned to each simulated patient using the inverse probability-weighted method to account for the restricted re-randomization of patients with MRD- CR who had received IVO [31]. Specifically, when comparing embedded treatment regimen (2, 2) with (1, 1), simulated patients from subgroups 1A, 1B, and 2A were assigned a weight of 1 and simulated patients from subgroup 2C were assigned a weight of 2; hence, simulated patients in subgroup 2C accounted for themselves and also another patient who had been randomized to subgroup 2B but excluded from the comparison. Likewise, when comparing embedded treatment regimen (2, 1) with (1, 1), simulated patients from subgroups 1A, 1B, and 2A were assigned a weight of 1 and simulated patients from subgroup 2B were assigned a weight of 2. For each comparison, if PFS was in favor of embedded treatment regimen (2, 2) or (2, 1), respectively, and the one-sided P value from the robust score test was <0.025, then superiority was claimed. Otherwise, futility was claimed. This process was repeated 10 000 times for each comparison, and the sample size was increased until at least 90% of simulations claimed superiority for each of the comparisons.
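The weighting scheme just described can be illustrated as follows. This sketch covers only the inverse probability weights, not the robust score test itself, and all names (`ASSIGNMENT_PROBABILITY`, `ipw_weight`) are hypothetical. A patient's weight is the reciprocal of the probability of the second-stage assignment actually received, or zero if the patient's treatment path is not consistent with the regimen being evaluated.

```python
# Probability of receiving the observed second-stage intervention,
# given first-stage arm and MRD- CR status (1:1 re-randomization for IVO + MRD- CR).
ASSIGNMENT_PROBABILITY = {"1A": 1.0, "1B": 1.0, "2A": 1.0, "2B": 0.5, "2C": 0.5}

REGIMENS = {
    "(1, 1)": {"1A", "1B"},
    "(2, 1)": {"2A", "2B"},
    "(2, 2)": {"2A", "2C"},
}


def ipw_weight(subgroup, regimen):
    """Inverse probability weight of a patient when estimating `regimen`."""
    if subgroup not in REGIMENS[regimen]:
        return 0.0  # treatment path is inconsistent with this regimen
    return 1.0 / ASSIGNMENT_PROBABILITY[subgroup]


# Expected IVO arm of 244 patients with a 50% MRD- CR rate:
# 122 in subgroup 2A, 61 in subgroup 2B, and 61 in subgroup 2C.
ivo_counts = {"2A": 122, "2B": 61, "2C": 61}
weighted_n = sum(n * ipw_weight(sg, "(2, 2)") for sg, n in ivo_counts.items())
print(weighted_n)
```

With these expected counts, the weighted number of patients contributing to regimen (2, 2) equals the full IVO arm, which is the intent of letting each subgroup 2C patient stand in for one excluded subgroup 2B patient.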

With 244 patients randomized to each first-stage intervention, there was 90% power to detect superior PFS of embedded treatment regimen (2, 2) versus (1, 1) and 98% power to detect superior PFS of embedded treatment regimen (2, 1) versus (1, 1), with control of one-sided type I error equal to 0.025 for each comparison. Additional power calculations are provided in Table 2 for a range of MRD- CR rates that are considered plausible for these agents, while fixing all other assumed parameters. For a variety of scenarios, sufficient power was still achieved.

Table 2.

Power to detect superior progression-free survival of embedded treatment regimen (2, 2) versus (1, 1) and (2, 1) versus (1, 1), when different bone marrow minimal residual disease negativity with complete remission (BM MRD- CR) rates are assumed for patients randomized to first-stage interventions ibrutinib + obinutuzumab (IO) and ibrutinib + venetoclax + obinutuzumab (IVO)

BM MRD- CR rate with IO | BM MRD- CR rate with IVO
 | 30% | 40% | 50% | 60% | 70%
5% | 95.1/97.4 | 93.6/98.3 | 92.6/99.0 | 90.9/99.2 | 89.1/99.5
10% | 93.0/96.4 | 91.4/97.6 | 90.4/98.1 | 88.2/98.8 | 85.8/99.2
15% | 90.5/95.0 | 88.8/96.2 | 86.7/97.5 | 84.8/98.2 | 82.2/99.0

Power to detect superior progression-free survival is first presented for the comparison of embedded treatment regimen (2, 2) versus (1, 1), and then for the comparison of embedded treatment regimen (2, 1) versus (1, 1), separated by a slash (/).

IO, ibrutinib plus obinutuzumab; MRD- CR, minimal residual disease negative with complete remission; IVO, ibrutinib plus venetoclax plus obinutuzumab.

Planned analyses

The primary aim was to compare embedded treatment regimens (2, 2) versus (1, 1) and (2, 1) versus (1, 1). Embedded treatment regimens would be compared using the weighted robust score test of a proportional hazards model as described above [30, 31]. In secondary analyses, comparisons would be adjusted for important baseline clinical and molecular markers of interest. In addition to testing sequential treatment effects, stage-specific treatment effects would be determined as part of secondary aims. MRD- CR rates would be compared between first-stage interventions using a standard chi-square test. Logistic regression would be used to correlate first-stage intervention with MRD- CR status, adjusting for stratification factors and important baseline clinical and molecular markers. Among patients who start IM therapy and have MRD and response assessments following the first-stage intervention, MRD- CR status would be correlated with PFS using standard proportional hazards models, adjusting for first-stage intervention and other important covariates measured at baseline or before IM therapy. These analyses would support further tailoring of sequential treatments and potentially identify additional patients who may be candidates for IM discontinuation.

Interim monitoring

One futility analysis was proposed for the comparison of embedded treatment regimens (2, 2) versus (1, 1) when half of the expected number of events have occurred. At that time, if the hazard ratio was >1.00 and in favor of embedded treatment regimen (1, 1), then the treatment strategy of embedded treatment regimen (2, 2) would not warrant further study. Thereafter, re-randomization to the second-stage intervention would discontinue and patients initially randomized to IVO who achieve MRD- CR status after the first-stage intervention would continue to receive IM. A permanent closure of the trial would not allow for the comparison of embedded treatment regimens (2, 2) versus (1, 1) to be completed, but the comparison of embedded treatment regimens (2, 1) versus (1, 1) could still be evaluated.

Sample size calculation with binary end points

In this study design, PFS was a time-to-event end point and we used simulation methods to deduce that the total required sample size was 488 patients. In the case where a binary end point is appropriate, the sample size required to compare embedded treatment regimens with different first-stage interventions could be calculated using a Wald test based on the difference in failure rates at a predetermined time point. An example with sample size calculations shown in the context of the SMART design considered for A041702 is provided in the Appendix. Treating PFS as a binary end point and comparing embedded treatment regimens requires a total sample size of 735 patients, over 1.5 times as many patients as required when treating PFS as a time-to-event end point. The increase in sample size when comparing PFS proportions at a single time point as opposed to comparing PFS across curves is expected [32]. The total required sample size using a binary end point to compare embedded treatment regimens is also higher than the total sample size of 608 patients required for a simple two-sample proportions test, since it must account for a second randomization.

Alternate trial designs

Several alternate trial designs were considered for A041702. Under the same schema used for the SMART design (Figure 1), a standard group sequential design could be used to compare PFS between patients randomized to IO with IM versus IVO with IM. The subgroup of patients randomized to IVO who achieve MRD- CR and are re-randomized to IM discontinuation would be excluded from the primary analysis. This approach, however, is problematic since systematic inclusion (or exclusion) of a subgroup of re-randomized patients based on an intermediate outcome, with an analysis using the standard log-rank test, results in a biased comparison [15, 30]. Further, the clinical outcome for the subgroup of patients re-randomized to IM discontinuation would be secondary and potentially uninformative.

A second option considered was to simplify the design and conduct a conventional two-arm randomized trial comparing (i) IO with IM versus (ii) IVO with IM or (iii) IVO with IM for patients without MRD- CR and IVO with IM discontinuation for patients with MRD- CR, similar to embedded treatment regimens (1, 1), (2, 1), and (2, 2), respectively in the SMART design. The perceived complexity associated with a SMART design and second randomization would be eliminated. Further, fewer patients would be required for a conventional two-arm comparison. However, if arms 1 and 2 were included in the design, then the treatment strategy of discontinuing IM therapy in patients with the deepest responses could not be evaluated. If arms 1 and 3 were included in the design, then only a subgroup analysis restricted to patients without an MRD- CR could evaluate whether IVO with IM was superior to IO with IM; this analysis would be descriptive in nature and could not be generalized to all patients as in the SMART design. In addition, secondary comparisons evaluating the impact of MRD- CR status could no longer be carried out since MRD- CR status would be confounded with the second-stage intervention.

Lastly, a three-arm randomized clinical trial could be conducted with patients randomized to each of the three arms described above, with formal comparisons of arms 2 and 3 versus 1. The primary disadvantage of this more standard approach is the much larger sample size required with a 3-arm trial than with the corresponding SMART design that borrows information from patients who do not achieve MRD- CR across embedded treatment regimens (2, 1) and (2, 2) [30, 31].

Discussion

We have shown that a SMART design strategy is directly applicable to a cancer clinical trial in which treatment decisions are guided by intermediate patient outcomes. At the completion of a SMART, a rich data source exists to identify an optimal treatment strategy that, if followed by all patients in the population, would lead to the most beneficial outcome on average. With an emphasis on precision medicine in cancer research, estimating an optimal regime is of utmost importance and SMART designs will likely become more applicable. In addition to the evaluation of adaptive interventions through SMART designs, the discovery of individualized optimal regimens through a new generation of reinforcement learning trials will also likely become more applicable [33].

Directly impacting older patients with CLL, the SMART design considered for A041702 allowed us to address two questions: (i) is IVO with IM superior to IO with IM and (ii) is the treatment strategy of IVO with IM for patients without MRD- CR and IVO with IM discontinuation for patients with MRD- CR superior to IO with IM. Ultimately, however, a conventional design comparing the two treatment strategies to answer only the second question was selected as the final design for A041702. With the same error constraints, interim monitoring for superiority and futility, and assuming a 5-year PFS estimate of 70% for the control treatment strategy, a hazard ratio of 0.55 could be detected with 431 patients. This is a 12% decrease in sample size from the 488 patients required by the proposed SMART design that included a second randomization with a discontinuation component.

We have shown that standard software can be used to provide sample size, power calculations, and data analysis for a SMART design. However, additional challenges and questions arise when implementing a SMART design compared with a conventional randomized clinical trial. Although not an impediment to the design itself, adequate infrastructure must exist to assign patients to different treatments at two different randomization time points, and intermediate outcomes must be assessed and entered into a database in a reasonable time frame to allow for re-randomization. Statistically, decisions regarding patients who are lost to follow-up before the intermediate outcome assessment must be determined. In the proposed study analysis, progressions and deaths before the intermediate outcome would be treated as events and patients lost to follow-up before the intermediate outcome would be censored and included in the subgroup of patients without MRD- CR. Patients discontinuing therapy before the intermediate outcome would continue to be followed for progression and included in the subgroup of patients without MRD- CR. These patients would be required to submit an end-of-treatment sample to assess MRD and would continue to be followed for response, permitting a sensitivity analysis with patients who are MRD- at the end of treatment and in CR at the planned assessment to be included in the MRD- CR with IM subgroup.

In the proposed study design, the intermediate outcome of MRD- CR is assessed once, at a fixed time point upon completion of the first-stage intervention. The intermediate outcome is not assessed more frequently because obtaining repeated bone marrow samples is impractical. However, if a noninvasive biomarker were available, incorporating more frequent intermediate outcome assessments could be valuable.

The inclusion of interim analyses for superiority and futility should be considered when planning a SMART design. To our knowledge, programs have not been developed to incorporate interim analysis boundaries with preservation of type I and II error when making sample size and power calculations for time-to-event or binary end points. In the proposed study design with one futility analysis for the comparison of long-term PFS, we did not make an adjustment in the sample size, but this is an ongoing area of research.

Conclusion

The SMART design strategy is presented in the context of a cancer clinical trial in older patients with CLL, along with design assumptions and a statistical analysis plan. In randomized trials that guide patient treatment based on an intermediate outcome, a SMART design can be used to identify appropriate treatment strategies while maintaining statistical rigor and should be considered.

Funding

This work was supported by the National Cancer Institute at the National Institutes of Health [grant numbers U10 CA180882 to ASR, JY, and SJM, U10 CA180850 to JCB and JAW, R35 CA197734 to JCB and ASR, P01 CA142538 to MD and AAT].

Disclosure

The authors have declared no conflicts of interest.

Appendix

The sample size formula for a one-sided test comparing two embedded treatment regimens with different first-stage interventions is as follows:

$$n_{\text{total}} = \frac{(z_\alpha + z_\beta)^2 \left( \dfrac{q_1}{r} + \dfrac{q_2}{1 - r} \right)}{\Delta^2},$$

where $z_\alpha$ and $z_\beta$ are the critical values of the Normal distribution at $\alpha$ and $\beta$, $q_1$ and $q_2$ are non-centrality parameters for each embedded treatment regimen, $r$ is the initial randomization probability used to assign patients to a first-stage intervention, and $\Delta$ is the difference in overall failure rates (i.e., failure rate = 1 – success rate) at a particular time point.

The non-centrality parameter for each embedded treatment regimen (i, j) is calculated using the following formula:

$$q_{ij} = a_{ij}^{T} \left( D_{ij} - p_{ij}\, p_{ij}^{T} \right) a_{ij},$$

where $a_{ij}$ is a 3 × 1 vector of weights corresponding with $p_{ij}$, a 3 × 1 vector of failure probabilities, and $D_{ij}$ is a diagonal matrix with the elements of $p_{ij}$ along the diagonal. Specifically, the first element of $p_{ij}$ is the probability of failure before the second randomization given first-stage intervention $i$; the second element of $p_{ij}$ is the probability of failure during the second-stage intervention given an intermediate outcome $j$ with first-stage intervention $i$; the third element of $p_{ij}$ is the probability of failure during the second-stage intervention given a different intermediate outcome $j$ with first-stage intervention $i$.
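As a minimal sketch of the quadratic form above, the following computes the non-centrality parameter from a weight vector and a failure-probability vector. The function name `noncentrality` is an assumption made for illustration; the inputs are the weight and probability vectors reported for regimens (1, 1) and (2, 2) in the worked example later in this appendix.

```python
def noncentrality(a, p):
    """Return a^T (D - p p^T) a, where D is the diagonal matrix built from p."""
    if len(a) != len(p):
        raise ValueError("a and p must have the same length")
    k = len(p)
    # Middle matrix D - p p^T: p[i] on the diagonal minus the outer product.
    middle = [
        [(p[i] if i == j else 0.0) - p[i] * p[j] for j in range(k)]
        for i in range(k)
    ]
    # Quadratic form a^T * middle * a.
    return sum(a[i] * middle[i][j] * a[j] for i in range(k) for j in range(k))


# Vectors as reported in the appendix's worked example.
q_11 = noncentrality([1, 1, 1], [0.0500, 0.2250, 0.0100])
q_22 = noncentrality([1, 1, 2], [0.0500, 0.0500, 0.0375])
print(round(q_11, 4), round(q_22, 4))
```

Algebraically the same quantity equals the sum of $a_k^2 p_k$ minus the square of the sum of $a_k p_k$, which gives a quick hand check of the two values used in the sample size calculation.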

Suppose we compare a binary end point between embedded treatment regimens (1, 1) and (2, 2) as defined in A041702 using ibrutinib + obinutuzumab (IO) and ibrutinib + venetoclax + obinutuzumab (IVO) as first-stage interventions and minimal residual disease (MRD-) status as the intermediate outcome. Suppose we assume equal allocation to the first-stage interventions, an MRD- rate of 10% with IO, an MRD- rate of 50% with IVO, and equal allocation to IM or IM discontinuation as a second-stage intervention among patients who are treated with IVO and achieve MRD- CR. Further, suppose we assume overall failure rates of 30%, 15%, 15%, and 20% for the four respective groups (1) IO/MRD+/IM, (2) IO/MRD-/IM, (3) IVO/MRD+/IM, and (4) IVO/MRD-/IM discontinuation. This results in assumed overall failure rates of 28.5% and 17.5% for embedded treatment regimens (1, 1) and (2, 2). Lastly, we assume a 5% failure rate with each of the first-stage interventions, allowing calculation of failure rates during the second-stage intervention given the intermediate outcome and first-stage intervention. Using a one-sided α = 0.025 and β = 0.10 results in a total sample size of 735 patients (n = 368 randomized to each first-stage intervention). Details regarding the values used in the sample size calculation are provided below.
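The assumptions in this paragraph can be turned into the overall failure rates and the failure-probability vectors with a few lines of arithmetic. The sketch below reflects one reading that reproduces the reported numbers: each second-stage element is taken as the subgroup probability times the overall failure rate minus the first-stage failure rate, times the second-stage allocation probability. The function and parameter names are illustrative assumptions.

```python
def overall_failure(mrd_neg_rate, fail_mrd_pos, fail_mrd_neg):
    """Overall failure rate of a regimen, averaged over the intermediate outcome."""
    return (1 - mrd_neg_rate) * fail_mrd_pos + mrd_neg_rate * fail_mrd_neg


def failure_vector(first_stage_fail, mrd_neg_rate, fail_mrd_pos,
                   fail_mrd_neg, alloc_mrd_neg=1.0):
    """Return [first-stage, MRD+ second-stage, MRD- second-stage] probabilities.

    alloc_mrd_neg is the probability that an MRD- patient is assigned to the
    second-stage option belonging to the regimen (1.0 if not re-randomized).
    """
    p_pos = (1 - mrd_neg_rate) * (fail_mrd_pos - first_stage_fail)
    p_neg = mrd_neg_rate * (fail_mrd_neg - first_stage_fail) * alloc_mrd_neg
    return [first_stage_fail, p_pos, p_neg]


# Regimen (1, 1): IO first, 10% MRD-, failure rates 30% (MRD+) and 15% (MRD-).
rate_11 = overall_failure(0.10, 0.30, 0.15)
p_11 = failure_vector(0.05, 0.10, 0.30, 0.15)

# Regimen (2, 2): IVO first, 50% MRD-, failure rates 15% (MRD+) and 20%
# (MRD- with IM discontinuation), with half of MRD- patients assigned to it.
rate_22 = overall_failure(0.50, 0.15, 0.20)
p_22 = failure_vector(0.05, 0.50, 0.15, 0.20, alloc_mrd_neg=0.5)

print(round(rate_11, 3), round(rate_22, 3))
print([round(x, 4) for x in p_11], [round(x, 4) for x in p_22])
```

Under this reading, the weight of 2 on the third element for regimen (2, 2) offsets the 50% allocation, so that the weighted sum of the vector recovers the 17.5% overall failure rate.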

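$$n_{\text{total}} = \frac{(1.96 + 1.28)^2 \left( \dfrac{0.2038}{0.5} + \dfrac{0.2194}{1 - 0.5} \right)}{0.11^2} = 735,$$

where $a_{11}^{T} = [1\ \ 1\ \ 1]$, $p_{11}^{T} = [0.0500\ \ 0.2250\ \ 0.0100]$, $a_{22}^{T} = [1\ \ 1\ \ 2]$, and $p_{22}^{T} = [0.0500\ \ 0.0500\ \ 0.0375]$.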
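The worked calculation can be reproduced with a short script. This is a sketch that plugs in the rounded critical values and non-centrality parameters exactly as reported; the function name `smart_total_n` is an assumption, and the result is rounded up to a whole number of patients.

```python
import math


def smart_total_n(z_alpha, z_beta, q1, q2, r, delta):
    """Total sample size: (z_a + z_b)^2 * (q1/r + q2/(1-r)) / delta^2."""
    return (z_alpha + z_beta) ** 2 * (q1 / r + q2 / (1 - r)) / delta ** 2


# Rounded inputs as they appear in the worked equation.
raw_n = smart_total_n(z_alpha=1.96, z_beta=1.28,
                      q1=0.2038, q2=0.2194, r=0.5, delta=0.11)

n_total = math.ceil(raw_n)        # round up to whole patients
n_per_arm = math.ceil(n_total / 2)  # equal allocation to IO and IVO

print(round(raw_n, 1), n_total, n_per_arm)
```

With these inputs the unrounded value is about 734.3, which rounds up to the reported 735 patients, or 368 per first-stage intervention.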

By comparison, a standard test of proportions comparing 0.285 with 0.175, with the same type I and II errors and no second randomization, requires a total sample size of 608 patients (n = 304 randomized to each intervention). Thus, re-randomization following IVO as the first-stage intervention, where only half of the patients who have MRD- CR continue with IM, requires 64 additional patients per group or approximately a 20% increase in total sample size.
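For reference, the arithmetic behind this comparison can be written out directly from the sample sizes reported above (no new assumptions are introduced):

```latex
\begin{aligned}
368 - 304 &= 64 \quad \text{additional patients per group},\\[4pt]
\frac{735 - 608}{608} &= \frac{127}{608} \approx 0.209 \quad \text{(approximately a 20\% increase in total sample size)}.
\end{aligned}
```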

References

  • 1. Eichhorst B, Dreyling M, Robak T. et al. Chronic lymphocytic leukemia: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 2011; 22(Suppl 6): vi50–vi54.
  • 2. Gribben JG. Salvage therapy for CLL and the role of stem cell transplantation. Hematology Am Soc Hematol Educ Program 2005; 2005(1): 292–298.
  • 3. Burger JA, Tedeschi A, Barr PM. et al. Ibrutinib as initial therapy for patients with chronic lymphocytic leukemia. N Engl J Med 2015; 373(25): 2425–2437.
  • 4. O’Brien S, Furman RR, Coutre S. et al. Single-agent ibrutinib in treatment-naïve and relapsed/refractory chronic lymphocytic leukemia: a 5-year experience. Blood 2018; 131: 1910–1919.
  • 5. Byrd JC, Furman RR, Coutre SE. et al. Targeting BTK with ibrutinib in relapsed chronic lymphocytic leukemia. N Engl J Med 2013; 369(1): 32–42.
  • 6. Herman SEM, Gordon AL, Hertlein E. et al. Bruton tyrosine kinase represents a promising therapeutic target for treatment of chronic lymphocytic leukemia and is effectively targeted by PCI-32765. Blood 2011; 117(23): 6287–6296.
  • 7. Burger JA, Sivina M, Ferrajoli A. et al. Randomized trial of ibrutinib versus ibrutinib plus rituximab in patients with chronic lymphocytic leukemia (CLL). Presented at American Society of Hematology 59th Annual Meeting (Abstr 427). Atlanta, GA December 9–12, 2017.
  • 8. Woyach JA, Ruppert AS, Heerema A. et al. Ibrutinib regimens versus chemoimmunotherapy in older patients with untreated CLL. N Engl J Med 2018; 379(26): 2517–2528.
  • 9. Cervantes-Gomez F, Lamothe B, Woyach JA. et al. Pharmacological and protein profiling suggests venetoclax (ABT-199) as optimal partner with ibrutinib in chronic lymphocytic leukemia. Clin Cancer Res 2015; 21(16): 3705–3715.
  • 10. Roberts AW, Davids MS, Pagel JM. et al. Targeting BCL2 with venetoclax in relapsed chronic lymphocytic leukemia. N Engl J Med 2016; 374(4): 311–322.
  • 11. Murphy SA. An experimental design for the development of adaptive treatment strategies. Stat Med 2005; 24(10): 1455–1481.
  • 12. Lavori PW, Dawson R. A design for testing clinical strategies: biased adaptive within-subject randomization. J R Stat Soc 2000; 163(1): 29–38.
  • 13. Lavori PW, Dawson R. Dynamic treatment regimes: practical design considerations. Clin Trials 2004; 1(1): 9–20.
  • 14. Insel TR. Beyond efficacy: the STAR*D trial. Am J Psychiatry 2006; 163(1): 5–7.
  • 15. Nahum-Shani I, Qian M, Almirall D. et al. Experimental design and primary data analysis methods for comparing adaptive interventions. Psychol Methods 2012; 17(4): 457–477.
  • 16. Stone RM, Berg DT, George SL. et al. Granulocyte-macrophage colony-stimulating factor after initial chemotherapy for elderly patients with primary acute myelogenous leukemia. N Engl J Med 1995; 332(25): 1671–1677.
  • 17. Tummarello D, Mari D, Graziano F. et al. A randomized, controlled phase III study of cyclophosphamide, doxorubicin, and vincristine with etoposide (CAV-E) or teniposide (CAV-T), followed by recombinant interferon-alpha maintenance therapy or observation, in small cell lung carcinoma patients with complete responses. Cancer 1997; 80(12): 2222–2229.
  • 18. Matthay KK, Villablanca JG, Seeger RC. et al. Treatment of high-risk neuroblastoma with intensive chemotherapy, radiotherapy, autologous bone marrow transplantation, and 13-cis-retinoic acid. N Engl J Med 1999; 341(16): 1165–1173.
  • 19. Matthay KK, Reynolds CP, Seeger RC. et al. Long-term results for children with high-risk neuroblastoma treated on a randomized trial of myeloablative therapy followed by 13-cis-retinoic acid: a children’s oncology group study. JCO 2009; 27(7): 1007–1013.
  • 20. Thall PF, Logothetis C, Pagliaro LC. et al. Adaptive therapy for androgen-independent prostate cancer: a randomized selection trial of four regimens. J Natl Cancer Inst 2007; 99(21): 1613–1622.
  • 21. Kidwell KM. SMART designs in cancer research: past, present, and future. Clin Trials 2014; 11(4): 445–456.
  • 22. Wang L, Rotnitzky A, Lin X. et al. Evaluation of viable dynamic treatment regimes in a sequentially randomized trial of advanced prostate cancer. J Am Stat Assoc 2012; 107(498): 493–508.
  • 23. Auyeung SF, Long Q, Royster EB. et al. Sequential multiple-assignment randomized trial design of neurobehavioral treatment for patients with metastatic malignant melanoma undergoing high-dose interferon-alpha therapy. Clin Trials 2009; 6(5): 480–490.
  • 24. Hallek M, Cheson BD, Catovsky D. et al. Guidelines for the diagnosis and treatment of chronic lymphocytic leukemia: a report from the International Workshop on Chronic Lymphocytic Leukemia updating the National Cancer Institute-Working Group 1996 guidelines. Blood 2008; 111(12): 5446–5456.
  • 25. Tang X, Melguizo M. DTR: an R package for estimation and comparison of survival outcomes of dynamic treatment regimes. J Stat Softw 2015; 65(7): 1–28.
  • 26. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, 2016. https://www.R-project.org/ (1 March 2019, date last accessed).
  • 27. Seymour JF, Kipps TJ, Eichhorst B. et al. Venetoclax-rituximab in relapsed or refractory chronic lymphocytic leukemia. N Engl J Med 2018; 378(12): 1107–1120.
  • 28. Moreno C, Greil R, Demirkan F. et al. Ibrutinib plus obinutuzumab versus chlorambucil plus obinutuzumab in first-line treatment of chronic lymphocytic leukaemia (iLLUMINATE): a multicentre, randomised, open-label, phase 3 trial. Lancet Oncol 2019; 20(1): 43–56.
  • 29. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958; 53(282): 457–481.
  • 30. Wolbers M, Helterbrand JD. Two-stage randomization designs in drug development. Stat Med 2008; 27(21): 4161–4174.
  • 31. Lunceford JK, Davidian M, Tsiatis AA. Estimation of survival distributions of treatment policies in two-stage randomization designs in clinical trials. Biometrics 2002; 58(1): 48–57.
  • 32. Rubinstein L. Phase II design: history and evolution. Chin Clin Oncol 2014; 3(4): 48.
  • 33. Zhao Y, Kosorok MR, Zeng D. Reinforcement learning design for cancer clinical trials. Stat Med 2009; 28(26): 3294–3315.

Articles from Annals of Oncology are provided here courtesy of Oxford University Press
