Author manuscript; available in PMC: 2026 Jun 11.
Published in final edited form as: Stat Methods Med Res. 2025 Aug 11;34(10):2069–2090. doi: 10.1177/09622802251360409

Selecting the optimal longitudinal cluster randomized design with a continuous outcome: parallel-arm, crossover or stepped-wedge

Jingxia Liu a,b, Fan Li c,d, Siobhan Sutcliffe a, Graham A Colditz a
PMCID: PMC13251621  NIHMSID: NIHMS2171977  PMID: 40785501

Abstract

The optimal designs (ODs) for parallel-arm longitudinal cluster randomized trials (PA-LCRTs), multiple-period cluster randomized crossover (CRXO) trials and stepped wedge cluster randomized trials (SW-CRTs), including closed-cohort and repeated cross-sectional designs, have been studied separately under a cost-efficiency framework based on generalized estimating equations (GEEs). However, whether a global OD exists across longitudinal designs and randomization schedules remains unknown. Therefore, this research addresses a critical gap by comparing OD features across complete LCRT designs with two treatment conditions and continuous outcomes. We define the OD as the design with either the lowest cost to obtain a desired level of power or the largest power given a fixed budget. For each of these ODs, we obtain the optimal number of clusters and the optimal cluster-period size (number of participants per cluster per period). To ensure equitable comparisons, we consider the GEE treatment effect estimator with the same block exchangeable correlation structure and develop OD algorithms with the lowest cost for each of six study designs. To obtain the OD with the largest power, we summarize previous OD algorithms and formulae and propose new ones. We suggest using the number of treatment sequences L=T-1, where T is the number of time-periods, in both the optimal closed-cohort and repeated cross-sectional SW-CRTs to have the lowest cost. This is consistent with our previous findings for ODs with the largest power in SW-CRTs. Comparing all six ODs, we conclude that optimal closed-cohort CRXO trials are global ODs, yielding both the lowest cost and largest power.

Keywords: Block exchangeable correlation structure, cluster randomized crossover (CRXO), parallel-arm longitudinal cluster randomized trial (PA-LCRT), optimal design (OD) under a budgetary constraint, stepped wedge cluster randomized trials (SW-CRTs)

1. Introduction

Cluster randomized trials (CRTs) are commonly used designs in implementation science and pragmatic clinical and educational research.1,2 These designs, which randomize participants at the cluster rather than the individual level, are often performed when randomization at the participant level would be infeasible and/or would lead to contamination of the intervention and biased estimation of the intervention effect.3 Therefore, CRTs are increasingly used to evaluate the real-world impact of health and educational interventions at higher institutional levels, such as hospitals, medical practices, and schools.

1.1. Overview of longitudinal CRT designs

Three main longitudinal CRTs (LCRTs) will be considered in this article: parallel-arm longitudinal cluster randomized trials (PA-LCRTs), cluster randomized crossover (CRXO) trials, and stepped wedge cluster randomized trials (SW-CRTs). Each of these designs randomizes participants at the cluster level (e.g. medical center, clinic, ward, or classroom). However, they vary in terms of which clusters receive the intervention and when during the trial (i.e. time-period) they receive this intervention. Table 1 provides example schematics for three possible longitudinal cluster randomized designs with 20 clusters.

Table 1.

Example schematics for three possible longitudinal cluster randomized designs with m=20 clusters

(a) An example of a PA-LCRT intervention schedule

| Cluster | Treatment sequence | Period 1 | Period 2 | Period 3 | Period 4 | Period 5 | Period 6 |
|---|---|---|---|---|---|---|---|
| 1,⋯,10 | 1 | Intervention | Intervention | Intervention | Intervention | Intervention | Intervention |
| 11,⋯,20 | 2 | Control | Control | Control | Control | Control | Control |

(b) An example of a CRXO trial intervention schedule

| Cluster | Treatment sequence | Period 1 | Period 2 | Period 3 | Period 4 | Period 5 | Period 6 |
|---|---|---|---|---|---|---|---|
| 1,⋯,10 | 1 | Intervention | Control | Intervention | Control | Intervention | Control |
| 11,⋯,20 | 2 | Control | Intervention | Control | Intervention | Control | Intervention |

(c) An example of a SW-CRT intervention schedule

| Cluster | Treatment sequence | Period 1 (Baseline) | Period 2 (Step 1) | Period 3 (Step 2) | Period 4 (Step 3) | Period 5 (Step 4) | Period 6 (Extended period) |
|---|---|---|---|---|---|---|---|
| 1,⋯,5 | 1 | Control | Intervention | Intervention | Intervention | Intervention | Intervention |
| 6,⋯,10 | 2 | Control | Control | Intervention | Intervention | Intervention | Intervention |
| 11,⋯,15 | 3 | Control | Control | Control | Intervention | Intervention | Intervention |
| 16,⋯,20 | 4 | Control | Control | Control | Control | Intervention | Intervention |

The first LCRT that we consider, PA-LCRTs, randomizes clusters to one of two treatment conditions. All participants within a given cluster receive the same treatment assignment and participants remain in their assigned treatment condition for the duration of the trial. In a typical PA-LCRT, participants within a cluster are often measured at multiple time points. For example, Faggiano et al. investigated the effect of a school-based substance abuse prevention program in which schools were randomly assigned to one of four experimental groups.4 The behavioral endpoints from each student were collected repeatedly at the baseline, 6-month and 18-month assessments. In this type of design, measurements across different time points are correlated within a participant (if a closed-cohort is considered), and measurements across participants at the same time point are correlated within a cluster.

Similar to PA-LCRTs, cluster randomized crossover (CRXO) trials randomize clusters, but unlike PA-LCRTs, to a sequence of treatment conditions rather than to one treatment condition. Within each cluster, participants receive the same treatment condition in each time-period. For example, for two-period CRXO trials, all enrolled clusters are randomized to the assignments of either IC (intervention in the first time-period, followed by control in the second time-period) or CI (control in the first time-period, followed by intervention in the second time-period). CRXO trials may require a washout period between two consecutive time-periods to minimize the carryover effect. For example, Jeyaratnam et al. conducted a two-period CRXO trial to determine whether a rapid screening test leads to a reduction in methicillin resistant Staphylococcus aureus (MRSA) acquisition on hospital general wards.5 They randomized 10 wards to receive either rapid screening for MRSA or conventional culture screening. This study included a three-month baseline period, a five-month first intervention period, a one-month washout period, and a five-month second intervention period. By leveraging both within- and between-cluster comparisons, CRXO trials are resource efficient and could be statistically more powerful than other designs.

The third major LCRT that we consider, stepped wedge cluster randomized trials (SW-CRTs), is similar to CRXO trials, except that it staggers treatment assignments over time. All participants within a cluster receive the same treatment in each time-period, but not all clusters experience each treatment for the same amount of time. In a typical SW-CRT, 1) all clusters start from the control condition at time-period 1 (the baseline period); 2) at each subsequent time-period, a subset of clusters is randomized to receive the intervention condition and to maintain the intervention status until the end of the study; and 3) at the end of the study, all clusters receive the intervention. SW-CRTs are particularly attractive to stakeholders when they perceive the intervention to be beneficial and when implementation of the intervention is logistically more feasible for a smaller fraction of enrolled clusters at an earlier time-period.

For all three types of LCRTs, we will consider two different sampling designs in our development: closed-cohort and repeated cross-sectional designs. In closed-cohort designs, the same participants are followed across time-periods and the outcome is measured in each time-period, whereas in repeated cross-sectional designs, different participants are enrolled in each time-period, and the outcome is collected within each of these time-periods.6,7

1.2. Variance and sample size considerations for LCRTs

Given their clustered designs, investigators must consider the following three correlations in sample size calculations and statistical analyses of LCRTs: the “within-period,” “inter-period,” and “within-participant” correlations. The “within-period correlation” measures the similarity in outcomes between two participants in the same cluster and time-period. The “inter-period correlation” measures the similarity in outcomes between two participants in the same cluster but from different time-periods. The “within-participant correlation” measures the similarity in outcomes from the same participant across time-periods and is typically only needed under a closed-cohort design.

Taking these correlations into account, previous studies have derived variance formulae for: 1) PA-LCRTs, using generalized estimating equations (GEEs) with at least two correlation parameters;8,9 and 2) T-period closed-cohort and repeated cross-sectional CRXO trials, assuming that T is a multiple of 2 and that treatment assignments repeat the same pattern after the first two periods (e.g., intervention, control, intervention, control, and so on).10–14 Investigators have also derived sample size formulae for closed-cohort and repeated cross-sectional SW-CRTs with two treatment conditions.15–25

In addition to variance and sample size formulae, researchers have proposed optimal sample sizes for CRTs, including the optimal number of clusters and cluster-period size (i.e. number of participants per cluster per period) as a function of cost and intracluster correlation coefficients for a given budget.26–32 When correlation parameters are known, optimization provides the sample size that generates the minimum variance of the estimated treatment effect (ETE). The optimal designs (ODs) for PA-LCRTs, multiple-period CRXO trials, and SW-CRTs including closed-cohort and repeated cross-sectional designs have also been studied separately using GEEs under a cost-efficiency framework.14,33 Specifically for multiple-period CRXO trials and assuming linear mixed models, Grantham et al. studied the optimal number of crossovers for maximizing efficiency and cost-efficiency,34 whereas Moerbeek et al. studied the optimal number of time-periods when a treatment switches at the end of each time-period, and the optimal number of treatment switches with a fixed number of time-periods.35 However, whether global ODs exist across study designs and randomization schedules remains unknown. Therefore, this research addresses a critical gap by comparing OD features across each of the following six designs: PA-LCRTs, multiple-period CRXO trials, and SW-CRTs with two different sampling designs (repeated cross-sectional or closed-cohort designs) each.
Importantly, our work also differs from previous investigations comparing different LCRTs that assumed a fixed sample size without optimization and budget restrictions.15,21 For example, Hemming and Taljaard found that, given the same total sample size, SW-CRTs can be either more or less efficient than PA-LCRTs, depending on the value of the assumed correlation parameter.15 Such observations may or may not be generalizable to the comparison between optimal SW-CRT and optimal PA-LCRT designs, because the OD itself is a nonlinear function of the study budget and correlation parameters.

To ensure equitable comparisons across designs, we consider the same statistical model for all three LCRTs: a marginal model with a constant intervention effect and categorical period effects, estimated by GEEs, in which the treatment effect parameter carries a population-averaged interpretation.36 For PA-LCRTs, the treatment assignment for clusters is either I (intervention) or C (control), and π is the proportion of clusters receiving the I assignment. For CRXO trials, the treatment sequence assignment has repeating patterns of either IC or CI (e.g., ICICIC or CICICI for T=6) and π is the proportion of clusters receiving the IC treatment sequence. Finally, for SW-CRTs, the treatment assignment is also described by L, the number of treatment sequences. For example, for T=6, L could be 3, 4, or 5. When L=3, clusters are allocated to one of the following treatment sequences: CIIIII, CCIIII, and CCCIII; when L=4, possible sequences are: CIIIII, CCIIII, CCCIII, and CCCCII; and when L=5, sequences are: CIIIII, CCIIII, CCCIII, CCCCII, and CCCCCI. In other words, SW-CRTs with L=3 or 4 include additional periods where all clusters are in the intervention condition (sometimes referred to as a maintenance phase).37 Thus, for the purpose of our evaluation, we first consider how to choose the number of treatment sequences L in SW-CRTs. In addition, this research focuses mainly on optimizing the sample size configuration under balanced assignment to each sequence, allowing for a maintenance phase of the trial. Then, using this suggested L, we compare different LCRT designs to address the question “which design requires the lowest cost to obtain a desired level of power or has the largest power under a fixed budget?” Addressing this research question can help investigators weigh the merits of each design option and potentially maximize cost efficiency at the study planning stage.
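The sequence patterns listed above for SW-CRTs can be enumerated mechanically. The following Python sketch is only an illustration; the function name `sw_sequences` is hypothetical and it assumes that sequence l spends its first l periods in control (C) and the remaining T - l periods in intervention (I), as in the examples for T=6.

```python
from typing import List


def sw_sequences(T: int, L: int) -> List[str]:
    """Return the L stepped-wedge treatment sequences over T periods.

    Sequence l (1-based) is in control (C) for the first l periods and in
    intervention (I) for the remaining T - l periods.
    """
    if not 2 <= L <= T - 1:
        raise ValueError("the number of sequences must satisfy 2 <= L <= T - 1")
    return ["C" * l + "I" * (T - l) for l in range(1, L + 1)]


# Sequences for T = 6 and L = 3, 4, 5
for L in (3, 4, 5):
    print(L, sw_sequences(6, L))
```

With L < T - 1 the last periods are all-intervention for every sequence, which corresponds to the maintenance phase described in the text.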

This article is organized as follows. In Section 2, we provide an overview of ETE variance formulae from GEE analyses for PA-LCRTs, multiple-period CRXO trials, and SW-CRTs with two treatment conditions and continuous outcomes. Section 3 proposes the OD with the lowest cost at a desired level of power for each study design and then compares ODs across six study designs. These include closed-cohort and repeated cross-sectional PA-LCRTs, multiple-period CRXO trials, and SW-CRTs. In Section 4, we illustrate our proposed ODs through a real example. Finally, in Section 5 we discuss our findings and offer suggestions for future research. Data sharing is not applicable to this article as no new data are created or analyzed in this study.

2. Statistical models and GEEs in LCRTs: A review

Let $Y_{ijt}$ be a continuous outcome from participant $j=1,\ldots,n$ in cluster $i=1,\ldots,m$ and time-period $t=1,\ldots,T$. The marginal model is $\mu_{ijt}=\delta_t+X_{it}\beta$, where $\mu_{ijt}$ is a marginal mean, $\delta_t$ is the $t$th time-period effect, $X_{it}$ is the treatment indicator of cluster $i$ in time-period $t$ (=1 if receiving intervention, 0 otherwise), and $\beta$ is the treatment effect of interest. Note that $\beta$ is assumed to be constant over time periods; $\delta_t+\beta$ represents the intervention effect while $\delta_t$ represents the control effect at the $t$th time period. The number of participants (or observations) per cluster per period, $n$, is generally referred to as the cluster-period size. Here we assume that every participant contributes complete measurements (i.e., no missing outcomes). For a continuous outcome with mean 0 and variance $\sigma^2$, the hypotheses of interest are $H_0:\beta=0$ versus $H_1:\beta=\beta^*$ (an assumed value for the effect size).

To estimate the ETE variance, Preisser et al. suggested using the following block exchangeable correlation structure for LCRTs:38

  1. a constant within-period correlation (between outcomes from different participants within the same cluster $i$ during the same time-period $t$), $\mathrm{Corr}(Y_{ij_1t},Y_{ij_2t})=\alpha_0$ for $j_1\neq j_2$;

  2. a constant inter-period correlation (outcomes from different participants within the same cluster $i$ but across different time-periods), $\mathrm{Corr}(Y_{ij_1t_1},Y_{ij_2t_2})=\alpha_1$ for $j_1\neq j_2$ and any $t_1\neq t_2$;

  3. a constant within-participant correlation (outcomes from the same participant $j$ across different time-periods), $\mathrm{Corr}(Y_{ijt_1},Y_{ijt_2})=\alpha_2$ for $t_1\neq t_2$.

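Li et al. noted that the ETE variance is well-defined under the eigenvalue constraint $\min\{\lambda_1,\lambda_2,\lambda_3,\lambda_4\}>0$, where the distinct eigenvalues of the block exchangeable correlation structure are $\lambda_1=1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_2=1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, $\lambda_3=1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, and $\lambda_4=1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$.19 When $\alpha_1<\min\{\alpha_0,\alpha_2\}$, we have $\lambda_2>\lambda_1$ and $\lambda_3>\lambda_1$. As such, if $\lambda_1>0$, then the constraint $\min\{\lambda_1,\lambda_2,\lambda_3,\lambda_4\}>0$ is satisfied given that $\lambda_4>\lambda_3$. Below, we present an overview of the ETE variances from GEE analyses for individual LCRT designs.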
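The eigenvalue constraint can be checked numerically before any variance is computed. The sketch below is illustrative only: the function names are hypothetical, and the eigenvalue expressions are written as $\lambda_1=1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_2=1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, $\lambda_3=1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, and $\lambda_4=1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$.

```python
from typing import Tuple


def block_exchangeable_eigenvalues(
    n: int, T: int, a0: float, a1: float, a2: float
) -> Tuple[float, float, float, float]:
    """Distinct eigenvalues of the block exchangeable correlation structure.

    n: cluster-period size, T: number of periods,
    a0, a1, a2: within-period, inter-period, within-participant correlations.
    """
    lam1 = 1 - a0 + a1 - a2
    lam2 = 1 - a0 - (T - 1) * (a1 - a2)
    lam3 = 1 + (n - 1) * (a0 - a1) - a2
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    return lam1, lam2, lam3, lam4


def satisfies_constraint(n: int, T: int, a0: float, a1: float, a2: float) -> bool:
    """True when the smallest eigenvalue is strictly positive."""
    return min(block_exchangeable_eigenvalues(n, T, a0, a1, a2)) > 0


lams = block_exchangeable_eigenvalues(n=13, T=4, a0=0.05, a1=0.02, a2=0.2)
print(lams, satisfies_constraint(13, 4, 0.05, 0.02, 0.2))
```

In this example $\alpha_1<\min\{\alpha_0,\alpha_2\}$, so $\lambda_1$ is the smallest of the four values and checking it alone would suffice.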

2.1. Closed-cohort PA-LCRTs

Let $\pi$, the proportion of clusters receiving the intervention assignment, be a pre-determined value, e.g., 50%. The numbers of clusters in the intervention and control conditions are $m_{\mathrm{intv}}=m\pi$ and $m_{\mathrm{cont}}=m(1-\pi)$, respectively. Table 1(a) illustrates a PA-LCRT design with $m=20$, $T=6$ and $\pi=50\%$. To estimate the ETE variance for closed-cohort PA-LCRTs (as well as multiple-period CRXO trials; see below), Wang et al. assumed a more general correlation structure than Preisser et al. as follows:9

  1. a constant within-period correlation, $\mathrm{Corr}(Y_{ij_1t},Y_{ij_2t})=\alpha_0$ for $j_1\neq j_2$;

  2. an inter-period correlation matrix, denoted by
    $$\phi=\begin{pmatrix}\phi_{11}&\cdots&\phi_{1T}\\ \vdots&\ddots&\vdots\\ \phi_{T1}&\cdots&\phi_{TT}\end{pmatrix},$$
    where $\phi_{tt}=\alpha_0$ and $\phi_{t_1t_2}=\mathrm{Corr}(Y_{ij_1t_1},Y_{ij_2t_2})$ for $j_1\neq j_2$ and any $t_1\neq t_2$; and
  3. a within-participant correlation matrix, denoted by
    $$\Omega=\begin{pmatrix}1&\cdots&\omega_{1T}\\ \vdots&\ddots&\vdots\\ \omega_{T1}&\cdots&1\end{pmatrix},$$
    where $\omega_{t_1t_2}=\mathrm{Corr}(Y_{ijt_1},Y_{ijt_2})$ for $t_1\neq t_2$.

Using these assumptions and notation, the correlation structure is fully specified by ϕ and Ω. The correlation matrix for the vector of outcomes within each cluster is given by

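$$R_i=I_{n\times n}\otimes(\Omega-\phi)+\mathbf{1}_{n\times n}\otimes\phi,$$

where $\otimes$ denotes the Kronecker product, $I_{n\times n}$ is the identity matrix, and $\mathbf{1}_{n\times n}$ is the matrix of ones. For closed-cohort PA-LCRTs, Wang et al. derived the ETE variance as9

$$\mathrm{Var}(\hat\beta)=\frac{\sigma^2\sum_{t_1=1}^{T}\sum_{t_2=1}^{T}\{\omega_{t_1t_2}+(n-1)\phi_{t_1t_2}\}}{T^2nm\pi(1-\pi)}.$$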

When both $\phi$ and $\Omega$ have a compound symmetry structure (for example, $\phi_{t_1t_2}=\alpha_1$ and $\omega_{t_1t_2}=\alpha_2$ for $t_1\neq t_2$), this leads to a block exchangeable correlation structure and the above variance of the ETE simplifies to

$$\mathrm{Var}(\hat\beta)=\frac{\sigma^2\lambda_4}{Tmn\pi(1-\pi)},\qquad(1.1)$$

where $\lambda_4$ is the leading eigenvalue of the block exchangeable correlation structure. We note that Equation (1.1) is identical to the variance in the sample size formula (Eq. (27)) in Liu et al.39 and is a special case of the sample size formula (Eq. in Section 3.3.3) in Wang et al.40 Teerenstra et al. have also proposed a nested exchangeable correlation structure with two constant correlations, $\alpha_1$ and $\alpha_2$, for a PA-LCRT,8 which is the block exchangeable correlation structure simplified by setting $\alpha_0=\alpha_1$.
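The simplification to Equation (1.1) can be verified in a few lines. The worked derivation below is a sketch that starts from the general variance of Wang et al. and assumes the compound symmetry values $\phi_{tt}=\alpha_0$, $\phi_{t_1t_2}=\alpha_1$, $\omega_{tt}=1$, and $\omega_{t_1t_2}=\alpha_2$ for $t_1\neq t_2$.

```latex
\begin{aligned}
\mathrm{Var}(\hat\beta)
  &= \frac{\sigma^2\sum_{t_1=1}^{T}\sum_{t_2=1}^{T}
     \{\omega_{t_1t_2}+(n-1)\phi_{t_1t_2}\}}{T^2 n m \pi(1-\pi)},\\[4pt]
\sum_{t_1=1}^{T}\sum_{t_2=1}^{T}\omega_{t_1t_2}
  &= T + T(T-1)\alpha_2,\\
\sum_{t_1=1}^{T}\sum_{t_2=1}^{T}\phi_{t_1t_2}
  &= T\alpha_0 + T(T-1)\alpha_1,\\
\sum_{t_1=1}^{T}\sum_{t_2=1}^{T}\{\omega_{t_1t_2}+(n-1)\phi_{t_1t_2}\}
  &= T\{1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2\}
   = T\lambda_4,\\[4pt]
\mathrm{Var}(\hat\beta)
  &= \frac{\sigma^2\,T\lambda_4}{T^2 n m \pi(1-\pi)}
   = \frac{\sigma^2\lambda_4}{T m n \pi(1-\pi)}.
\end{aligned}
```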

2.2. Closed-cohort multiple-period CRXO trials

Table 1(b) demonstrates a CRXO trial design with $m=20$, $T=6$ and $\pi=50\%$. We assume either no carryover effect or a washout period that minimizes the carryover effect. Using a block exchangeable correlation structure, Liu et al. obtained the ETE variance for a $T$-period CRXO trial as

$$\mathrm{Var}(\hat\beta)=\frac{\sigma^2\lambda_3}{Tmn\pi(1-\pi)}.\qquad(2.1)$$

The proof was provided by Liu et al. in their Appendix.14 When both $\phi$ and $\Omega$ from Section 2.1 have a compound symmetry structure, the ETE variance from Wang et al.9 is the same as Equation (2.1) for an even value of $T$.

2.3. Closed-cohort SW-CRTs

To estimate the ETE variance for SW-CRTs, an additional parameter is required: the number of steps. A step is defined as a time period in which at least one cluster crosses over from control to intervention. The total number of steps equals the number of treatment sequences $L$, and each cluster $i$ is allocated to a specific treatment sequence $l=1,\ldots,L$. Of note, the number of treatment sequences must be smaller than the number of time-periods, $2\le L\le T-1$; otherwise, some clusters will not receive the intervention. The number of clusters that cross over at each step is denoted by $h_l$, such that $\sum_{l=1}^{L}h_l=m$. Table 1(c) demonstrates a study design with $m=20$, $T=6$, $L=4$, and $h_1=h_2=h_3=h_4=5$.

Under the following assumptions:

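  1. an equal number of clusters crossing over to intervention at each step, $h_l=h$ for all $l$; and

  2. $T\ge L+1$,

Liu et al.14 derived the approximate variance of the ETE for closed-cohort SW-CRTs as

$$\mathrm{Var}(\hat\beta)\approx\frac{4\sigma^2}{mn}\times\frac{\frac{3}{2\left(L-\frac{1}{L}\right)}\,T\lambda_3\lambda_4}{\frac{L\lambda_3}{2}+\left(T-\frac{L}{2}\right)\lambda_4}.\qquad(3.1)$$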

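Of note, the denominator $\left(L-\frac{1}{L}\right)\left\{\frac{L\lambda_3}{2}+\left(T-\frac{L}{2}\right)\lambda_4\right\}$, which up to terms that do not depend on $L$ equals $\frac{\lambda_3-\lambda_4}{2}\left(L-\frac{T}{1-\lambda_3/\lambda_4}\right)^2-\frac{T\lambda_4}{L}$, is not a monotonic function of $L$ given that $\lambda_3<\lambda_4$.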
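A minimal Python sketch of the approximate closed-cohort SW-CRT variance follows. It reads Equation (3.1) as $\frac{4\sigma^2}{mn}\times\frac{3}{2(L-1/L)}\times\frac{T\lambda_3\lambda_4}{L\lambda_3/2+(T-L/2)\lambda_4}$; the function name and argument order are hypothetical.

```python
def sw_variance_closed(m: int, n: int, T: int, L: int,
                       a0: float, a1: float, a2: float,
                       sigma2: float = 1.0) -> float:
    """Approximate Var(beta_hat) for a closed-cohort SW-CRT.

    Assumes equal numbers of clusters per sequence and T >= L + 1.
    """
    if not 2 <= L <= T - 1:
        raise ValueError("the number of sequences must satisfy 2 <= L <= T - 1")
    lam3 = 1 + (n - 1) * (a0 - a1) - a2
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    numerator = 3.0 / (2.0 * (L - 1.0 / L)) * T * lam3 * lam4
    denominator = L * lam3 / 2.0 + (T - L / 2.0) * lam4
    return 4.0 * sigma2 / (m * n) * numerator / denominator


# Illustrative inputs: T = 4 periods, L = 3 sequences, 51 clusters of size 13
v = sw_variance_closed(m=51, n=13, T=4, L=3, a0=0.05, a1=0.02, a2=0.2)
print(v)
```

Because the variance is inversely proportional to $m$, multiplying the returned value by $m$ gives a per-cluster quantity that can be reused when solving for the required number of clusters.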

2.4. Repeated cross-sectional LCRTs

Different participants are enrolled at each time-period in repeated cross-sectional trials, so the within-participant correlation $\alpha_2$ is no longer needed in the within-cluster correlation structure. Therefore, the ETE variances of repeated cross-sectional longitudinal trials can be obtained by setting $\alpha_2=\alpha_1$ in the ETE variance formulae for closed-cohort PA-LCRTs, CRXO trials and SW-CRTs.19 For ease of reference, Table 2 summarizes all ETE variances for PA-LCRTs, multiple-period CRXO trials, and SW-CRTs from Section 2.

Table 2.

ETE Variances from GEE analyses for PA-LCRTs, multiple-period CRXO trials, and SW-CRTs

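| Design | $\mathrm{Var}(\hat\beta)$ | Note |
|---|---|---|
| Closed-cohort PA-LCRTs | $\dfrac{\sigma^2\lambda_4}{Tmn\pi(1-\pi)}$ (1.1) | Ref: Wang et al. $\lambda_4=1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$; $\pi$: the proportion of clusters receiving the assignment of intervention. |
| Repeated cross-sectional PA-LCRTs | $\dfrac{\sigma^2\lambda_4^R}{Tmn\pi(1-\pi)}$ (1.2) | $\lambda_4^R=1+(n-1)\alpha_0+(T-1)n\alpha_1$. |
| Closed-cohort multiple-period CRXO trials | $\dfrac{\sigma^2\lambda_3}{Tmn\pi(1-\pi)}$ (2.1) | Ref: Liu et al. $\lambda_3=1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$; $\pi$: the proportion of clusters receiving the assignment of IC in the first two periods. |
| Repeated cross-sectional multiple-period CRXO trials | $\dfrac{\sigma^2\lambda_3^R}{Tmn\pi(1-\pi)}$ (2.2) | Ref: Liu et al. $\lambda_3^R=1+(n-1)\alpha_0-n\alpha_1$. |
| Closed-cohort SW-CRT | $\dfrac{4\sigma^2}{mn}\times\dfrac{\frac{3}{2(L-\frac{1}{L})}T\lambda_3\lambda_4}{\frac{L\lambda_3}{2}+(T-\frac{L}{2})\lambda_4}$ (3.1) | Ref: Li et al. $\lambda_3=1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$; $\lambda_4=1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$. |
| Repeated cross-sectional SW-CRT | $\dfrac{4\sigma^2}{mn}\times\dfrac{\frac{3}{2(L-\frac{1}{L})}T\lambda_3^R\lambda_4^R}{\frac{L\lambda_3^R}{2}+(T-\frac{L}{2})\lambda_4^R}$ (3.2) | Ref: Li et al. $\lambda_3^R=1+(n-1)\alpha_0-n\alpha_1$; $\lambda_4^R=1+(n-1)\alpha_0+(T-1)n\alpha_1$. |

$\hat\beta$: ETE; $\sigma^2$: outcome variance; $\alpha_0$: within-period correlation; $\alpha_1$: inter-period correlation; $\alpha_2$: within-participant correlation; $T$: number of time-periods; $L$: number of treatment sequences; $m$: total number of clusters; $n$: cluster-period size (number of participants per cluster per period).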
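The repeated cross-sectional eigenvalues in Table 2 follow from substituting $\alpha_2=\alpha_1$ into the closed-cohort expressions. A short worked check, using the closed-cohort forms $\lambda_1=1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_3=1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, and $\lambda_4=1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$:

```latex
\begin{aligned}
\lambda_1^{R} &= 1-\alpha_0+\alpha_1-\alpha_1 = 1-\alpha_0,\\
\lambda_3^{R} &= 1+(n-1)(\alpha_0-\alpha_1)-\alpha_1
               = 1+(n-1)\alpha_0-n\alpha_1,\\
\lambda_4^{R} &= 1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_1
               = 1+(n-1)\alpha_0+(T-1)n\alpha_1 .
\end{aligned}
```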

3. ODs with the lowest cost

To derive the ODs, we assume that the association parameters in the correlation structure are known or can be estimated from routinely collected data. Methods and examples for obtaining intraclass correlation coefficient (ICC) estimates in LCRTs have been discussed in detail elsewhere.14 Based on marginal models estimated by GEE, previous researchers have shown that $\hat\beta/\sqrt{\mathrm{Var}(\hat\beta)}$ is approximately normally distributed in CRTs when the number of clusters is sufficiently large.8,9,19 Note that this normal approximation may be inaccurate when the number of clusters is small. Using the ETE variance, the required sample size can be calculated to meet a certain power requirement, e.g., 80%, at a specific type I error rate, e.g., 5%. We will assume a type I error of 5% in all remaining examples. If the cluster-period size $n$ is known, then the number of clusters $m$ can be calculated based on the ETE variance for a pre-determined number of time-periods $T$. Similarly, when the number of clusters $m$ is known, the cluster-period size $n$ can be calculated. Of note, Hemming et al. showed that the desired power may not be achieved in the latter scenario.41
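The calculation of $m$ for a known $n$ can be sketched as follows. This is an illustration under the usual two-sided normal approximation, $m\ge(z_{1-\alpha/2}+z_{\mathrm{power}})^2\,v/\beta^{*2}$, where $v=m\,\mathrm{Var}(\hat\beta)$ is the variance for a single cluster; the function name and the example inputs are assumptions, not part of the original work.

```python
import math
from statistics import NormalDist


def required_clusters(var_per_cluster: float, beta: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Smallest integer m such that a two-sided z-test reaches the target power.

    var_per_cluster is m * Var(beta_hat), i.e. the ETE variance with m = 1.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(z * z * var_per_cluster / (beta * beta))


# Closed-cohort PA-LCRT with n = 12, T = 4, pi = 0.5 (illustrative values)
n, T, pi, sigma2 = 12, 4, 0.5, 1.0
a0, a1, a2 = 0.05, 0.02, 0.2
lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
v = sigma2 * lam4 / (T * n * pi * (1 - pi))   # Equation (1.1) with m = 1
print(required_clusters(v, beta=0.2))
```

The same helper applies to the CRXO and SW-CRT variances, since each of them is also inversely proportional to $m$.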

Following the cost-efficiency framework in Liu and Li,14 we assume that the cost per cluster recruitment is $c$ currency units (e.g., $US), the cost per participant enrollment is $s$ currency units, and the cost per outcome measurement is $e$ currency units. Thus, the total cost is calculated as $TC=m(c+sn+eTn)$ in closed-cohort trials and $TC=m\{c+(s+e)Tn\}$ in repeated cross-sectional trials.
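The two cost functions can be written directly in code. The sketch below reads the closed-cohort cost as $m(c+sn+eTn)$ and the repeated cross-sectional cost as $m\{c+(s+e)Tn\}$; the second reading, in which each of the $Tn$ participants per cluster incurs one enrollment cost and one measurement cost, is an assumption about the garbled source formula. Function names are hypothetical.

```python
def total_cost_closed(m: int, n: int, T: int, c: float, s: float, e: float) -> float:
    """Closed cohort: n participants per cluster, each measured in all T periods."""
    return m * (c + s * n + e * T * n)


def total_cost_repeated(m: int, n: int, T: int, c: float, s: float, e: float) -> float:
    """Repeated cross-sections: n new participants per period, measured once each."""
    return m * (c + (s + e) * T * n)


# Illustrative unit costs: c = 3000 per cluster, s = 200 per participant,
# e = 50 per outcome measurement
print(total_cost_closed(m=51, n=13, T=4, c=3000, s=200, e=50))
print(total_cost_repeated(m=84, n=7, T=4, c=3000, s=200, e=50))
```

Under this reading, the combined per-participant cost in a repeated cross-sectional design is $s+e$, which equals 250 for the unit costs used here.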

We refer to the enrollment feasibility of the cluster-period size, e.g., $(2,n_{\max})$, and of the number of clusters, e.g., $(2,m_{\max})$, as the design space. We define the OD as the design within the design space with:

  • the lowest cost in the design space that obtains a desired level of power, e.g., 80%; or

  • the lowest ETE variance, equivalent to the highest power, in the design space given a fixed budget.26,42–44

In the next sub-sections, we present a general algorithm for identifying the OD with the lowest cost per trial design. We also determine the optimal number of treatment sequences L in SW-CRTs, and then we use this value to identify the OD for all six study designs of interest.

3.1. OD algorithms with the lowest cost for LCRTs

First, the OD algorithm for closed-cohort PA-LCRTs is proposed as follows:

Step 1. Specify desired levels of type I error and power, and the design space $(2,n_{\max})$ and $(2,m_{\max})$.

Step 2. For each integer value of $n$ in the design space,

  1. calculate the eigenvalues $\lambda_1,\lambda_2,\lambda_3,\lambda_4$ and check that $\min\{\lambda_1,\lambda_2,\lambda_3,\lambda_4\}>0$, where $\lambda_1=1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_2=1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, $\lambda_3=1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, and $\lambda_4=1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$;

  2. calculate the smallest $m$ that obtains the desired level of power at the given type I error using the variance (1.1), and check that $m\le m_{\max}$;

  3. if conditions 1) and 2) are satisfied, calculate the total cost as $m(c+sn+eTn)$.

Step 3. Select the design with the smallest total cost in the design space, $(n_{OD},m_{OD})$.
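The three steps above can be sketched as a simple grid search over $n$. This is a minimal illustration, not the authors' implementation. It assumes the variance $\sigma^2\lambda_4/\{Tmn\pi(1-\pi)\}$, the cost $m(c+sn+eTn)$, a two-sided normal approximation for power, and, as an additional assumption, that $m$ is rounded up to a multiple of 2 so that both arms have equal size when $\pi=0.5$. All names and default values are hypothetical.

```python
import math
from statistics import NormalDist


def od_lowest_cost_pa_closed(T, a0, a1, a2, beta, sigma2=1.0, pi=0.5,
                             c=3000, s=200, e=50, alpha=0.05, power=0.80,
                             n_max=5000, m_max=5000, multiple=2):
    """Return (n, m, cost) with the lowest cost for a closed-cohort PA-LCRT."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    best = None
    for n in range(2, n_max + 1):
        # Step 2.1: eigenvalue constraint
        lam1 = 1 - a0 + a1 - a2
        lam2 = 1 - a0 - (T - 1) * (a1 - a2)
        lam3 = 1 + (n - 1) * (a0 - a1) - a2
        lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
        if min(lam1, lam2, lam3, lam4) <= 0:
            continue
        # Step 2.2: smallest m reaching the target power
        v = sigma2 * lam4 / (T * n * pi * (1 - pi))   # m * Var(beta_hat)
        m = math.ceil(z * z * v / beta ** 2)
        m = max(2, multiple * math.ceil(m / multiple))  # assumed balance rule
        if m > m_max:
            continue
        # Step 2.3: total cost of this candidate design
        cost = m * (c + s * n + e * T * n)
        # Step 3: keep the cheapest design seen so far
        if best is None or cost < best[2]:
            best = (n, m, cost)
    return best


print(od_lowest_cost_pa_closed(T=4, a0=0.05, a1=0.02, a2=0.2, beta=0.2))
```

The exhaustive loop is cheap because each candidate $n$ needs only a closed-form evaluation; ties on cost, if any, are resolved in favor of the smaller $n$ in this sketch.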

Similarly, we propose OD algorithms with the lowest cost for multiple-period CRXO trials and SW-CRTs. Algorithms for the six LCRT designs are detailed in Table 3. Of note, for closed-cohort CRTs the cluster size is the same as the cluster-period size n; whereas for repeated cross-sectional CRTs, the actual cluster size over all periods is n×T, as different participants are included in each distinct time period. Therefore, for closed-cohort CRTs, the total sample size (distinct number of participants) can be considered as N=m×n; whereas for repeated cross-sectional CRTs, it is N=m×n×T.

Table 3.

Optimal designs with the lowest cost for PA-LCRTs, multiple-period CRXO trials and SW-CRTs

| Design | Algorithm for integer estimates |
|---|---|
| Common, unless otherwise specified | Step 1. Specify desired levels of type I error and power, and the design space $(2,n_{\max})$ and $(2,m_{\max})$. Step 2. For each integer value of $n$ in the design space: 1) calculate eigenvalues and check that the minimum of eigenvalues $>0$; 2) calculate the smallest $m$ to obtain the desired level of power at the given type I error using the variance and check that $m\le m_{\max}$; 3) if conditions 1) and 2) are satisfied, calculate the total cost. Step 3. Select the design with the smallest total cost in the design space. |
| Closed-cohort PA-LCRTs | 1) eigenvalues $\lambda_1=1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_2=1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, $\lambda_3=1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, $\lambda_4=1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$; 2) variance (1.1); 3) total cost $m(c+sn+eTn)$ |
| Repeated cross-sectional PA-LCRTs | 1) eigenvalues $\lambda_1^R=1-\alpha_0$, $\lambda_3^R=1+(n-1)\alpha_0-n\alpha_1$, $\lambda_4^R=1+(n-1)\alpha_0+(T-1)n\alpha_1$; 2) variance (1.2); 3) total cost $m\{c+(s+e)Tn\}$ |
| Closed-cohort multiple-period CRXO trials | 1) eigenvalues $\lambda_1=1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_2=1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, $\lambda_3=1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, $\lambda_4=1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$; 2) variance (2.1); 3) total cost $m(c+sn+eTn)$ |
| Repeated cross-sectional multiple-period CRXO trials | 1) eigenvalues $\lambda_1^R=1-\alpha_0$, $\lambda_3^R=1+(n-1)\alpha_0-n\alpha_1$, $\lambda_4^R=1+(n-1)\alpha_0+(T-1)n\alpha_1$; 2) variance (2.2); 3) total cost $m\{c+(s+e)Tn\}$ |
| Closed-cohort SW-CRT | Step 1. the design space $(2,n_{\max})$ and $(L,m_{\max})$. Step 2. 1) eigenvalues $\lambda_1=1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_2=1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, $\lambda_3=1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, $\lambda_4=1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$; 2) variance (3.1); 3) total cost $m(c+sn+eTn)$ |
| Repeated cross-sectional SW-CRT | Step 1. the design space $(2,n_{\max})$ and $(L,m_{\max})$. Step 2. 1) eigenvalues $\lambda_1^R=1-\alpha_0$, $\lambda_3^R=1+(n-1)\alpha_0-n\alpha_1$, $\lambda_4^R=1+(n-1)\alpha_0+(T-1)n\alpha_1$; 2) variance (3.2); 3) total cost $m\{c+(s+e)Tn\}$ |

$\alpha_0$: within-period correlation; $\alpha_1$: inter-period correlation; $\alpha_2$: within-participant correlation; $T$: number of time-periods; $L$: number of treatment sequences; $n$: cluster-period size (number of participants per cluster per period); $n_{\max}$: maximum cluster-period size; $m$: total number of clusters; $m_{\max}$: maximum number of clusters; $c$: cost per cluster; $s$: cost per participant; for closed-cohort LCRTs, $e$: cost per time-period.

3.2. Choosing the number of treatment sequences in SW-CRTs

To ensure equitable comparisons across study designs, we next determine the optimal number of treatment sequences $L$ in SW-CRTs. For a pre-determined value of $T$, $L$ can in theory be any integer between 2 and $T-1$ (allowing for a different number of clusters to be randomized to each unique sequence). We start by examining ODs for $T=4$. Table 4 details the calculated total cost $TC_{OD}$, number of clusters $m_{OD}$, cluster-period size $n_{OD}$, and total sample size $N_{OD}$ under the OD for known association parameters $(\alpha_0,\alpha_1,\alpha_2)$ in closed-cohort and repeated cross-sectional SW-CRTs, assuming a type I error of 5%, a power of 80%, a treatment effect of $\beta=0.2$, $\sigma^2=1$, $c=3000$, $s=200$, and $e=50$. For example, for a closed-cohort SW-CRT with $(\alpha_0,\alpha_1,\alpha_2)=(0.05,0.020,0.2)$ and $L=2$, the OD has a total cost of 684.0k, a total number of clusters of 76, a cluster-period size of 15, and a total sample size of 1140, whereas for a similar trial with $L=3$, these values are 418.2k, 51, 13, and 663, respectively. As a further example, for a repeated cross-sectional SW-CRT with $(\alpha_0,\alpha_1)=(0.05,0.020)$ and $L=2$, the OD has a total cost of 1408.0k, a total number of clusters of 128, a cluster-period size of 8, and a total sample size of 4096, whereas for a similar trial with $L=3$, these values are 840.0k, 84, 7, and 2352, respectively.

Table 4.

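Optimal designs with the lowest cost to obtain 80% power for SW-CRTs with $T=4$ for known association parameters $(\alpha_0,\alpha_1,\alpha_2)$

| $(\alpha_0,\alpha_1,\alpha_2)$ | CC L=2 Cost | CC L=2 m | CC L=2 n | CC L=2 N | CC L=3 Cost | CC L=3 m | CC L=3 n | CC L=3 N | RCS L=2 Cost | RCS L=2 m | RCS L=2 n | RCS L=2 N | RCS L=3 Cost | RCS L=3 m | RCS L=3 n | RCS L=3 N |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (0.05, 0.020, 0.2) | 684.0 | 76 | 15 | 1140 | 418.2 | 51 | 13 | 663 | 1408.0 | 128 | 8 | 4096 | 840.0 | 84 | 7 | 2352 |
| (0.05, 0.020, 0.6) | 471.2 | 76 | 8 | 608 | 304.2 | 39 | 12 | 468 | | | | | | | | |
| (0.05, 0.020, 0.8) | 322.4 | 52 | 8 | 416 | 217.8 | 33 | 9 | 297 | | | | | | | | |
| (0.05, 0.040, 0.2) | 536.0 | 40 | 26 | 1040 | 349.8 | 33 | 19 | 627 | 1254.0 | 66 | 16 | 4224 | 780.0 | 60 | 10 | 2400 |
| (0.05, 0.040, 0.6) | 330.0 | 30 | 20 | 600 | 216.0 | 24 | 15 | 360 | | | | | | | | |
| (0.05, 0.040, 0.8) | 207.2 | 28 | 11 | 308 | 140.4 | 18 | 12 | 216 | | | | | | | | |
| (0.10, 0.020, 0.2) | 950.4 | 144 | 9 | 1296 | 574.2 | 87 | 9 | 783 | 1744.0 | 218 | 5 | 4360 | 1008.0 | 126 | 5 | 2520 |
| (0.10, 0.020, 0.6) | 702.0 | 130 | 6 | 780 | 452.4 | 78 | 7 | 546 | | | | | | | | |
| (0.10, 0.020, 0.8) | 512.4 | 122 | 3 | 366 | 340.2 | 81 | 3 | 243 | | | | | | | | |
| (0.10, 0.040, 0.2) | 880.4 | 142 | 8 | 1136 | 546.0 | 78 | 10 | 780 | 1674.0 | 186 | 6 | 4464 | 999.0 | 111 | 6 | 2664 |
| (0.10, 0.040, 0.6) | 626.4 | 108 | 7 | 756 | 405.0 | 75 | 6 | 450 | | | | | | | | |
| (0.10, 0.040, 0.8) | 450.0 | 90 | 5 | 450 | 300.0 | 60 | 5 | 300 | | | | | | | | |

CC: closed-cohort; RCS: repeated cross-sectional. Treatment effect = 0.2, outcome variance = 1; $\alpha_0$: within-period correlation; $\alpha_1$: inter-period correlation; $\alpha_2$: within-participant correlation; $T$: number of time-periods; $L$: number of treatment sequences; cost per cluster $c=3000$; for closed-cohort LCRTs, cost per participant $s=200$, cost per time-period $e=50$, and total cost $=m(c+sn+eTn)$; for repeated cross-sectional LCRTs, cost per participant 250 (that is, $s+e$ with $s=200$ and $e=50$) and total cost $=m\{c+(s+e)Tn\}$; unit in the Cost columns: k; $m$: total number of clusters; $m_{\max}=5000$; $n$: cluster-period size (number of participants per cluster per period); $n_{\max}=5000$; $N$: total sample size. Only $(\alpha_0,\alpha_1)$ is needed for repeated cross-sectional trials, so those columns are filled once per $(\alpha_0,\alpha_1)$ pair.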
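The first row of Table 4 can be used to sanity-check an implementation of the lowest-cost search for SW-CRTs. The sketch below is illustrative and rests on several stated assumptions: the variance is read as $\frac{4\sigma^2}{mn}\times\frac{3}{2(L-1/L)}\times\frac{T\lambda_3\lambda_4}{L\lambda_3/2+(T-L/2)\lambda_4}$, repeated cross-sectional designs set $\alpha_2=\alpha_1$, costs are $m(c+sn+eTn)$ and $m\{c+(s+e)Tn\}$, power uses a two-sided normal approximation, and $m$ is rounded up to a multiple of $L$ so that each sequence receives the same number of clusters. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def od_lowest_cost_sw(T, L, a0, a1, a2=None, beta=0.2, sigma2=1.0,
                      c=3000, s=200, e=50, alpha=0.05, power=0.80,
                      n_max=5000, m_max=5000):
    """Return (n, m, cost) with the lowest cost for an SW-CRT.

    Pass a2 for a closed-cohort design; leave a2=None for a repeated
    cross-sectional design (a2 is then set equal to a1).
    """
    closed = a2 is not None
    if not closed:
        a2 = a1
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    best = None
    for n in range(2, n_max + 1):
        lam1 = 1 - a0 + a1 - a2
        lam2 = 1 - a0 - (T - 1) * (a1 - a2)
        lam3 = 1 + (n - 1) * (a0 - a1) - a2
        lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
        if min(lam1, lam2, lam3, lam4) <= 0:
            continue
        # m * Var(beta_hat) under the approximate SW-CRT variance
        v = (4 * sigma2 / n) * (3 / (2 * (L - 1 / L))) * T * lam3 * lam4 \
            / (L * lam3 / 2 + (T - L / 2) * lam4)
        # assumed balance rule: m is the next multiple of L
        m = max(L, L * math.ceil(z * z * v / beta ** 2 / L))
        if m > m_max:
            continue
        per_cluster = c + s * n + e * T * n if closed else c + (s + e) * T * n
        cost = m * per_cluster
        if best is None or cost < best[2]:
            best = (n, m, cost)
    return best


print(od_lowest_cost_sw(T=4, L=2, a0=0.05, a1=0.02, a2=0.2))
print(od_lowest_cost_sw(T=4, L=3, a0=0.05, a1=0.02, a2=0.2))
print(od_lowest_cost_sw(T=4, L=3, a0=0.05, a1=0.02))
```

Under these assumptions the search returns the closed-cohort entries for $L=2$ and $L=3$ and the repeated cross-sectional entry for $L=3$ reported in the first row of the table.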

Figure 1 presents the required $TC_{OD}$, $m_{OD}$, $n_{OD}$, and $N_{OD}$ for various closed-cohort SW-CRTs. Each plot includes multiple lines for different values of $L$ and correlations $(\alpha_0,\alpha_1,\alpha_2)$. Within each plot, we hold two of the three correlation values constant and vary the third. For example, $(\alpha_1,\alpha_2)=(0.020,0.6)$ in Figures 1A–1D, $(\alpha_0,\alpha_2)=(0.05,0.6)$ in Figures 1E–1H, and $(\alpha_0,\alpha_1)=(0.05,0.020)$ in Figures 1I–1L. The total cost, optimal number of clusters, and total sample size decrease with increasing $L$; however, as $L$ varies, no patterns are observed for the optimal cluster-period size. In addition, as $\alpha_0$ increases, the total cost, optimal number of clusters, and total sample size generally increase, whereas the optimal cluster-period size generally decreases for specific values of $L$. In contrast, as $\alpha_1$ increases, the total cost, optimal number of clusters, and total sample size generally decrease, whereas the optimal cluster-period size generally increases for specific values of $L$. As $\alpha_2$ increases, the total cost, optimal number of clusters, optimal cluster-period size, and total sample size generally decrease for specific values of $L$. Figure 2 presents the same information for repeated cross-sectional SW-CRTs and illustrates that the same conclusions hold for repeated cross-sectional SW-CRTs as for closed-cohort SW-CRTs.

Figure 1.

Optimal designs with the lowest cost and 80% power to detect a treatment effect size of 0.2 and variance of σ² = 1 for known association parameters (α0, α1, α2), assuming cost per cluster c=3000, cost per participant s=200, and cost per time-period e=50 in closed-cohort SW-CRTs with T=4

Figure 2.

Optimal designs with the lowest cost and 80% power to detect a treatment effect size of 0.2 and variance of σ² = 1 for known association parameters (α0, α1), assuming cost per cluster c=3000 and cost per participant s=250 in repeated cross-sectional SW-CRTs with T=4

Extending these analyses to additional values of T (see Supplementary Figures 1–10), we see that the optimal number of clusters and total sample size decrease with increasing L and reach a minimum at L=T-1 for T ≤ 6. When T > 6, these values are not at a minimum for L=T-1, but they are often very close to the minimum. Therefore, we recommend using L=T-1 in SW-CRTs to yield the lowest cost. We also use this value in the remaining subsections when we compare optimal sample size calculations across the six designs of interest.

3.3. Comparing ODs with the lowest cost across six LCRTs

In this subsection, we focus our analysis on T ≥ 4 because: 1) SW-CRTs require that 2 ≤ L ≤ T-1, so T must be ≥ 3; and 2) Equation (2.1) requires an even number of time-periods T, making 4 the minimum value of T that can be used for all LCRTs. We also limit our investigation to T ≤ 12 because T is generally relatively small in practice. Specifically, in a review of 160 published SW-CRTs between 2016 and 2022, the interquartile range of the number of sequences is (4, 7), which is covered by our choice of design parameters. This subsection assumes the same desired levels of type I error and power, treatment effect, and unit costs as in Section 3.2.

3.3.1. Comparing ODs at a fixed value of T

Table 5 details the required TC_OD, m_OD, n_OD, and N_OD for known association parameters (α0, α1, α2) and all six designs at T=4. For example, for (α0, α1, α2) = (0.05, 0.020, 0.2), the OD for a closed-cohort PA-LCRT has a total cost of 358.8k, a total number of clusters of 46, a cluster-period size of 12, and a total sample size of 552, whereas for a repeated cross-sectional PA-LCRT, these values are 480.0k, 60, 5, and 1200, respectively. For a closed-cohort CRXO trial, the OD has a total cost of 144.0k, a total number of clusters of 16, a cluster-period size of 15, and a total sample size of 240, whereas for a repeated cross-sectional CRXO trial, these values are 330.0k, 22, 12, and 1056, respectively. Finally, for a closed-cohort SW-CRT, the OD has a total cost of 418.2k, a total number of clusters of 51, a cluster-period size of 13, and a total sample size of 663, whereas for a repeated cross-sectional SW-CRT, these values are 840.0k, 84, 7, and 2352, respectively.

Table 5.

Optimal designs with the lowest cost to obtain 80% power for six designs with T=4 for known association parameters (α0, α1, α2)

Association parameters (α0, α1, α2) | PA-LCRT, closed-cohort | PA-LCRT, repeated cross-sectional | CRXO, closed-cohort | CRXO, repeated cross-sectional | SW-CRT (L=3), closed-cohort | SW-CRT (L=3), repeated cross-sectional
 | Cost m n N | Cost m n N | Cost m n N | Cost m n N | Cost m n N | Cost m n N
(0.05, 0.020, 0.2) | 358.8 46 12 552 | 480.0 60 5 1200 | 144.0 16 15 240 | 330.0 22 12 1056 | 418.2 51 13 663 | 840.0 84 7 2352
(0.05, 0.020, 0.6) | 514.8 66 12 792 | - | 92.4 14 9 126 | - | 304.2 39 12 468 | -
(0.05, 0.020, 0.8) | 582.8 62 16 992 | - | 64.8 12 6 72 | - | 217.8 33 9 297 | -
(0.05, 0.040, 0.2) | 429.2 74 7 518 | 560.0 80 4 1280 | 107.2 8 26 208 | 264.0 12 19 912 | 349.8 33 19 627 | 780.0 60 10 2400
(0.05, 0.040, 0.6) | 602.0 86 10 860 | - | 63.6 6 19 114 | - | 216.0 24 15 360 | -
(0.05, 0.040, 0.8) | 680.8 92 11 1012 | - | 42.0 6 10 60 | - | 140.4 18 12 216 | -

Treatment effect = 0.2; outcome variance = 1. α0: within-period correlation; α1: inter-period correlation; α2: within-participant correlation; T: number of time-periods; L: number of treatment sequences; m: total number of clusters (mmax = 5000); n: cluster-period size, i.e., number of participants per cluster per period (nmax = 5000); N: total sample size. Cost per cluster c = 3000. For closed-cohort LCRTs, cost per participant s = 200 and cost per time-period e = 50, with total cost = m(c + sn + eTn). For repeated cross-sectional LCRTs, cost per participant s = 250, with total cost = m(c + sTn). Costs are reported in thousands (k). Only α0 and α1 are needed for repeated cross-sectional trials.
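
As a quick arithmetic check of the cost columns, the sketch below recomputes the total cost and total sample size for the first row of Table 5 from the tabulated m and n. It assumes the cost functions m(c + sn + eTn) for closed-cohort designs and m(c + sTn) for repeated cross-sectional designs, and it assumes N = mn for closed-cohort designs and N = mnT for repeated cross-sectional designs; the latter is our reading of the tabulated values rather than a definition stated here. The function names are illustrative.

```python
# Unit costs taken from the table footnote.
C = 3000        # cost per cluster
S_CLOSED = 200  # cost per participant, closed-cohort designs
E = 50          # cost per time-period (outcome measurement), closed-cohort designs
S_CROSS = 250   # cost per participant, repeated cross-sectional designs
T = 4           # number of time-periods


def closed_cohort(m, n, T):
    """Return (total cost, total sample size) for a closed-cohort design."""
    cost = m * (C + S_CLOSED * n + E * T * n)
    return cost, m * n  # assumption: the same n participants are followed each period


def cross_sectional(m, n, T):
    """Return (total cost, total sample size) for a repeated cross-sectional design."""
    cost = m * (C + S_CROSS * T * n)
    return cost, m * n * T  # assumption: n new participants are sampled each period


# (design, evaluator, m, n) for the row (0.05, 0.020, 0.2) of Table 5.
row = [
    ("PA-LCRT, closed-cohort", closed_cohort, 46, 12),
    ("PA-LCRT, cross-sectional", cross_sectional, 60, 5),
    ("CRXO, closed-cohort", closed_cohort, 16, 15),
    ("CRXO, cross-sectional", cross_sectional, 22, 12),
    ("SW-CRT, closed-cohort", closed_cohort, 51, 13),
    ("SW-CRT, cross-sectional", cross_sectional, 84, 7),
]

for name, evaluate, m, n in row:
    cost, N = evaluate(m, n, T)
    print(f"{name:26s} cost = {cost / 1000:6.1f}k  N = {N}")
```

Under these assumptions the recomputed costs and sample sizes agree with the tabulated entries for this row.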

Figure 3 presents the ODs with the lowest cost under T=4 for varying correlation values. The estimates TC_OD, m_OD, n_OD, and N_OD from optimal repeated cross-sectional PA-LCRTs, CRXO trials, and SW-CRTs are constant (Figures 3I–3L) because α1 and α2 are constant and equal to each other.

Figure 3.

Optimal designs with the lowest cost and 80% power to detect a treatment effect size of 0.2 and variance of σ² = 1 for known association parameters (α0, α1, α2), assuming cost per cluster c=3000, cost per participant s=200, and cost per time-period e=50 across six designs with T=4

3.3.1.1. Comparing total cost under ODs at a fixed value of T=4

As α0 increases, the total cost increases for all six designs, whereas as α1 or α2 increases, the total cost increases for optimal PA-LCRTs only.

Comparing the ODs for different trial designs, the design with the lowest cost is an optimal closed-cohort CRXO trial, followed by an optimal repeated cross-sectional CRXO trial and an optimal closed-cohort SW-CRT. Optimal repeated cross-sectional SW-CRTs have a higher total cost than the other trial designs. Comparing the ODs for CRXO trials and SW-CRTs, optimal closed-cohort designs always require a lower cost than repeated cross-sectional trials. In contrast, optimal closed-cohort PA-LCRTs may require a higher cost than optimal repeated cross-sectional PA-LCRTs.

3.3.1.2. Comparing optimal number of clusters and cluster-period size at a fixed value of T=4

In addition to the total cost, the number of clusters and the cluster-period size are also important considerations, with the number of clusters generally playing a more important role than the cluster-period size. As α0 increases, the optimal number of clusters generally increases, whereas as α1 or α2 increases, the number of clusters generally increases for optimal PA-LCRTs only. The optimal cluster-period size generally decreases as α0 increases and increases as α1 increases. As α2 increases, the optimal cluster-period size increases for optimal closed-cohort PA-LCRTs only.

Comparing the OD for different trial designs, optimal closed-cohort CRXO trials require the smallest number of clusters, followed by optimal repeated cross-sectional CRXO trials and optimal closed-cohort SW-CRTs. Optimal repeated cross-sectional SW-CRTs sometimes return the largest number of clusters. Comparing the OD for CRXO trials and SW-CRTs, optimal closed-cohort designs require a much smaller number of clusters than repeated cross-sectional trials, given the same values of other design parameters.

3.3.1.3. Comparing the total sample size under ODs at a fixed value of T=4

A final feasibility consideration is the total sample size. As α0 increases, we observe that the total sample size is generally non-decreasing, whereas as α2 increases, the total sample size generally increases for optimal PA-LCRTs. As α1 increases, total sample size does not vary much for all six ODs.

Comparing the ODs for different trial designs, the trial design with the smallest total sample size is an optimal closed-cohort CRXO trial and the trial design with the largest total sample size is an optimal repeated cross-sectional SW-CRT. Optimal closed-cohort designs require a much smaller total sample size than repeated cross-sectional trials.

3.3.2. Comparing ODs at different values of T

Table 6 presents the required TC_OD, m_OD, n_OD, and N_OD for known association parameters α0=0.05, α1=0.020, α2=0.6 across all six designs at different values of T, and Figures 4A–4D visualize these results. As T increases, the total cost for optimal closed-cohort PA-LCRTs generally increases. In contrast, the opposite pattern is observed for the other five ODs. Overall, optimal closed-cohort CRXO trials, optimal closed-cohort SW-CRTs, and optimal repeated cross-sectional CRXO trials have lower costs than the other ODs. Optimal closed-cohort CRXO trials require the smallest number of clusters, followed by optimal repeated cross-sectional CRXO trials and optimal closed-cohort SW-CRTs.

Table 6.

Optimal designs with the lowest cost to obtain 80% power for six designs with α0=0.05, α1=0.020 and α2=0.6

T | PA-LCRT, closed-cohort | PA-LCRT, repeated cross-sectional | CRXO, closed-cohort | CRXO, repeated cross-sectional | SW-CRT (L=T-1), closed-cohort | SW-CRT (L=T-1), repeated cross-sectional
 | Cost m n N | Cost m n N | Cost m n N | Cost m n N | Cost m n N | Cost m n N
4 | 514.8 66 12 792 | 480.0 60 5 1200 | 92.4 14 9 126 | 330.0 22 12 1056 | 304.2 39 12 468 | 840.0 84 7 2352
6 | 558.0 62 12 744 | 465.0 62 3 1116 | 70.0 10 8 80 | 297.0 18 9 972 | 210.0 30 8 240 | 675.0 45 8 2160
8 | 612.0 60 12 720 | 450.0 50 3 1200 | 61.2 6 12 72 | 286.0 22 5 880 | 176.4 21 9 189 | 616.0 56 4 1792
10 | 669.6 72 9 648 | 448.0 56 2 1120 | 51.6 6 8 48 | 276.0 12 8 960 | 154.8 18 8 144 | 585.0 45 4 1800
12 | 726.0 66 10 660 | 450.0 50 2 1200 | 46.8 6 6 36 | 270.0 10 8 960 | 147.4 11 13 143 | 594.0 33 5 1980

Treatment effect = 0.2; outcome variance = 1. α0: within-period correlation; α1: inter-period correlation; α2: within-participant correlation; T: number of time-periods; L: number of treatment sequences; m: total number of clusters (mmax = 5000); n: cluster-period size, i.e., number of participants per cluster per period (nmax = 5000); N: total sample size. Cost per cluster c = 3000. For closed-cohort LCRTs, cost per participant s = 200 and cost per time-period e = 50, with total cost = m(c + sn + eTn). For repeated cross-sectional LCRTs, cost per participant s = 250, with total cost = m(c + sTn). Costs are reported in thousands (k). Only α0 and α1 are needed for repeated cross-sectional trials.

3.4. Summary

Considering all parameters of interest (TC, number of clusters, cluster-period size, and total sample size), our numerical studies find that optimal closed-cohort CRXO trials perform the best across the six ODs. However, depending on whether TC or other factors related to enrollment feasibility (e.g., number of clusters) are more important, we also recommend optimal repeated cross-sectional CRXO trials and closed-cohort SW-CRTs as strong alternatives. In Appendix 1, we summarize and propose the ODs with the highest power under a fixed budget for each of the six designs and then compare ODs across these designs. We reach the same conclusion: optimal closed-cohort CRXO trials perform the best, followed by optimal repeated cross-sectional CRXO trials. We also note that the integer estimates from the OD algorithms are very close to the decimal estimates from the OD formulae.

4. An example

This section re-designs the CRTs for the Prevention of Suicide in Primary Care Elderly: Collaborative Trial (PROSPECT), where the unit of randomization is the primary care practice and all patients within a primary care practice receive the same assignment (either intervention or usual care).45 The trial aims to determine the effect of a primary care intervention on suicidal ideation and depression in older patients and collects depression severity at baseline, 4 months, 8 months, and 12 months using the 24-item Hamilton Depression Rating Scale (HDRS). We consider T=4 and use the same parameter settings as in Wang et al.,9 with a treatment effect of 1, a standard deviation of 6, and (α0, α1, α2) = (0.03, 0.015, 0.3), in addition to the assumed unit costs c=3000, s=200, and e=50.

First, we identify an optimal design with the lowest cost to obtain at least 80% power at the two-sided significance level of 5%. The number of primary care practices m=56, cluster-period size (number of patients per primary care practice) n=15, and the total cost is 504k for an optimal closed-cohort PA-LCRT; m=68,n=6, and the total cost is 612k for an optimal repeated cross-sectional PA-LCRT; m=14,n=20, and the total cost is 154k for an optimal closed-cohort CRXO trial; m=24,n=14, and the total cost is 408k for an optimal repeated cross-sectional CRXO trial; m=48,n=17, and the total cost is 470.4k for an optimal closed-cohort SW-CRT and m=72,n=12, and the total cost is 1080k for an optimal repeated cross-sectional SW-CRT. Therefore, the optimal closed-cohort CRXO trial, followed by the optimal repeated cross-sectional CRXO trial and optimal closed-cohort SW-CRT, has the lowest cost.

Second, we consider B=408k as a given total budget and aim to identify the optimal design with the highest power. An optimal closed-cohort PA-LCRT returns m=52,n=12 with 71.3% power and an actual cost of 405.6k; an optimal repeated cross-sectional PA-LCRT returns m=40,n=7 with 62.6% power and an actual cost of 400k; an optimal closed-cohort CRXO trial returns m=40,n=18 with 99.6% power and an actual cost of 408k; an optimal repeated cross-sectional CRXO trial returns m=24,n=14 with 80.3% power and an actual cost of 408k; an optimal closed-cohort SW-CRT returns m=45,n=15 with 74.0% power and an actual cost of 405k; and an optimal repeated cross-sectional SW-CRT returns m=27,n=12 with 40.7% power and an actual cost of 405k. The optimal closed-cohort CRXO trial is the best, with the highest power within the budget, followed by the optimal repeated cross-sectional CRXO trial and optimal closed-cohort SW-CRT.
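
The reported power values for the two PA-LCRT designs can be checked with a few lines of code. The sketch below is a minimal illustration, not the authors' software: it uses the PA-LCRT variance σ²λ4/(Tnmπ(1-π)) given in the Appendix, sets α2=α1 for the repeated cross-sectional design, and assumes a two-sided normal-approximation power formula at the 5% level (the power formula itself is our assumption). The function name `pa_lcrt_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pa_lcrt_power(m, n, T, a0, a1, a2, delta, sigma, alpha=0.05, pi=0.5):
    """Approximate power of a PA-LCRT under block exchangeable correlation."""
    # Largest eigenvalue of the block exchangeable correlation matrix.
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    var = sigma ** 2 * lam4 / (T * n * m * pi * (1 - pi))
    z = NormalDist()
    # Assumption: normal approximation, ignoring the negligible opposite tail.
    return z.cdf(abs(delta) / sqrt(var) - z.inv_cdf(1 - alpha / 2))


T, DELTA, SIGMA = 4, 1.0, 6.0
A0, A1, A2 = 0.03, 0.015, 0.3

# Lowest-cost designs (target: at least 80% power).
closed_low_cost = pa_lcrt_power(56, 15, T, A0, A1, A2, DELTA, SIGMA)
cross_low_cost = pa_lcrt_power(68, 6, T, A0, A1, A1, DELTA, SIGMA)  # alpha2 = alpha1

# Highest-power designs under the budget B = 408k.
closed_budget = pa_lcrt_power(52, 12, T, A0, A1, A2, DELTA, SIGMA)
cross_budget = pa_lcrt_power(40, 7, T, A0, A1, A1, DELTA, SIGMA)  # alpha2 = alpha1

print(f"closed-cohort PA-LCRT,   m=56, n=15: power = {closed_low_cost:.3f}")
print(f"cross-sectional PA-LCRT, m=68, n=6:  power = {cross_low_cost:.3f}")
print(f"closed-cohort PA-LCRT,   m=52, n=12: power = {closed_budget:.3f}")
print(f"cross-sectional PA-LCRT, m=40, n=7:  power = {cross_budget:.3f}")
```

Under these assumptions, the two lowest-cost designs reach at least 80% power, and the two budget-constrained designs reproduce the reported 71.3% and 62.6% to three decimals.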

5. Discussion

In this paper, we discuss the OD for six possible LCRT designs: PA-LCRTs, multiple-period CRXO trials, and SW-CRTs, each in closed-cohort and repeated cross-sectional form, with two treatment conditions and continuous outcomes. Sample size formulae for these trials and ODs for some of these trials have been developed separately in prior research; however, research comparing ODs across trial designs is limited. Our contributions to the optimal CRT design literature are severalfold. First, we propose algorithms to identify the ODs with the lowest cost for desired levels of power for all six study designs. For SW-CRTs, in particular, we suggest using the number of treatment sequences L=T-1, where T is the number of time-periods, to yield the lowest cost. Second, under a fixed budget, we summarize and propose the ODs with the highest power for each of the six study designs and provide the ETE variance formulae under the OD for PA-LCRTs and multiple-period CRXO trials. Third, our findings suggest that optimal closed-cohort CRXO trials have the lowest cost, followed by optimal repeated cross-sectional CRXO trials and optimal closed-cohort SW-CRTs. In fact, we confirm that optimal closed-cohort CRXO trials excel under both optimization metrics and are global ODs. Fourth, given the suggestion that L=T-1 in both types of SW-CRTs under the OD with the highest power,14 we compare the six ODs and find that 1) optimal closed-cohort CRXO trials perform the best with the highest power; 2) optimal repeated cross-sectional CRXO trials and optimal closed-cohort SW-CRTs are the next two best designs with the highest power; and 3) decimal estimates show a more obvious trend than integer estimates for PA-LCRTs and CRXO trials. Finally, it is important to note that, while our development is based on GEE, our results should be equally applicable to trials analyzed by linear mixed models with matching random-effects structures, as we have primarily focused on a continuous outcome. For example, the sample size formulae have been shown to be identical for GEE and linear mixed models with comparable correlation structures for repeated cross-sectional and closed-cohort SW-CRTs (see, for example, Appendix D in Chen et al.).46

There are several limitations to this research. First, we only discussed closed-cohort and repeated cross-sectional designs and assumed complete measurements from each participant. Open cohort designs, hybrid designs, continuous recruitment continuous exposure designs, and incomplete designs were not included and could be important future work.7,47–50 Second, our comparisons of ODs are based on locally optimal designs under an assumed set of ICC values. In general, when historical or routinely collected data are available during the study planning stage, one may be able to fit GEEs or linear mixed models to compute the ICC values, based on methods discussed in Zhang et al. and Ouyang et al.51,52 Without such preliminary data, a common practice is to elicit the ICC values from the published literature. Specifically in the context of longitudinal CRTs, Korevaar et al. reported and summarized the ICC estimates from a collection of completed trials (the CLustered OUtcome Dataset bank) for a wide range of outcomes to inform future study planning.53 Addressing uncertainty in the ICC values for study planning, such as by using the MaxiMin optimal design, has been pursued in Liu and Li for multiple-period CRXO trials and SW-CRTs, and a full comparison of MaxiMin optimal designs deserves future research.14 On the other hand, the cost parameters are also required inputs for our methods and are generally based on the context, type of clusters, and study budget. Precise cost estimates for recruiting a cluster, enrolling an individual participant, and obtaining an outcome measurement would require collaboration between study investigators, health economists, and statisticians, and best practices for doing so merit continued discussion in the context of longitudinal CRTs. 
Third, while our work addresses the basic setting under a constant treatment effect assumption, it is possible that the treatment effect may depend on calendar time or exposure time across the six LCRTs.54–56 Our GEE model cannot address these complicated concerns, and we defer the development of ODs under a more complicated treatment effect structure to future research. Fourth, Section 3.2 considered only one form of treatment sequences for SW-CRTs. For example, for an SW-CRT with T=6 and 3 treatment sequences, we only addressed the sequences CIIIII, CCIIII, and CCCIII, where the number of intervention time-periods is 5, 4, and 3, respectively. However, alternative treatment sequences such as "CIIIII, CCCIII, CCCCCI" could also be an option, where the number of intervention time-periods is 5, 3, and 1, respectively. From the trial perspective, the investigator aims to maximize the number of intervention time-periods. Thus, it should be reasonable to consider our proposed scenario. Given that this research is restricted to these SW-CRTs when comparing across the six designs, further efficiency improvement may be possible when we allow for an additional layer of optimization across treatment sequences, such as those in Lawrie et al. and Li et al.57,58 However, that type of optimization, along with a local optimization under a total budget, can introduce additional computational challenges. Whether such additional maximization can substantially improve SW-CRTs is an open question for future research. We also assume that an equal number of clusters cross over to the intervention at each time-period for SW-CRTs. However, previous work has shown that equal allocation of clusters to sequences in stepped wedge designs is not optimal.57–59 Recently, Watson et al. attempted more extensive ODs for CRTs by identifying three broad classes of methods and combining these algorithms to select an optimal subset of cluster sequences.60 They determined the optimal allocation of clusters across a set of cluster sequences and the optimal cluster-period size for both Gaussian and non-Gaussian models using exchangeable and exponential decay covariance structures. It would therefore be interesting to combine considerations for treatment sequence optimization and for cost-efficiency in future work. Fifth, we assume that the proportion of clusters receiving the intervention is π=50% in PA-LCRTs and CRXO designs and that unit costs are equal between the two treatment conditions. For example, if unit costs differ between conditions, with a cost per cluster recruitment of c_q currency units, a cost per participant enrollment of s_q currency units, and a cost per outcome measurement of e_q currency units in treatment condition q (q = 0, 1) for closed-cohort trials, the total cost is TC = m_cont(c_0 + s_0 n + e_0 T n) + m_trt(c_1 + s_1 n + e_1 T n). Therefore, future work could explore OD algorithms under alternative assumptions and address between-arm heterogeneity.
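
To make the arm-specific cost expression concrete, the sketch below evaluates TC = m_cont(c_0 + s_0 n + e_0 T n) + m_trt(c_1 + s_1 n + e_1 T n) for a closed-cohort trial. It assumes that m_cont and m_trt are the numbers of clusters in the control and treatment conditions, which is our reading of the notation. The function name and the unequal-cost scenario (intervention clusters costing 4000 each) are hypothetical and serve only as an illustration.

```python
def total_cost_by_arm(m_cont, m_trt, n, T, c, s, e):
    """Total cost of a closed-cohort trial with arm-specific unit costs.

    c, s, e are pairs (control, treatment): cost per cluster, cost per
    participant, and cost per outcome measurement, respectively.
    """
    control = m_cont * (c[0] + s[0] * n + e[0] * T * n)
    treated = m_trt * (c[1] + s[1] * n + e[1] * T * n)
    return control + treated


# With equal unit costs, the expression reduces to m(c + sn + eTn):
# 46 clusters split 23/23, n = 12, T = 4 (first row of Table 5).
equal = total_cost_by_arm(23, 23, 12, 4, c=(3000, 3000), s=(200, 200), e=(50, 50))

# Hypothetical scenario: recruiting an intervention cluster costs 4000.
unequal = total_cost_by_arm(23, 23, 12, 4, c=(3000, 4000), s=(200, 200), e=(50, 50))

print(equal, unequal)
```

With equal unit costs the function returns the same total as the single-cost formula, so it can be dropped into a design search as a generalization rather than a replacement.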

We reiterate that our results are developed for selecting optimal longitudinal cluster randomized designs with a continuous outcome, and they may not directly generalize to binary or count outcomes. Previously, Liu et al. considered count data in two-/three-level PA-CRTs and reviewed the optimal designs of two-/three-level PA-CRTs with a binary outcome utilizing GEEs.14,33,61–63 However, when the design matrix becomes more complicated (as in SW-CRTs), the variance of the treatment effect estimator is often expressed in matrix form, and no simple scalar variance expression exists. With binary outcomes, Li et al. derived a matrix-based formula for sample size calculations in SW-CRTs, specifically leveraging an analytic inverse of the block exchangeable correlation matrix to facilitate efficient numerical computation of the variance.19 Building on that approach, future research could explore numerical methods, such as those in Li et al.,19 to identify ODs for SW-CRTs and extend our design comparison results to accommodate binary and count outcomes.

In conclusion, this paper fills an important gap in the current OD literature for six possible LCRT designs by discussing two different types of ODs, one with the lowest cost to obtain a desired level of power and the other with the highest power given a fixed budget. We compare the ODs across these six designs and conclude that the optimal closed-cohort CRXO trial is the theoretically global OD. However, in practice, a CRXO design may not always be feasible depending on the intervention and study context. For example, when an intervention is difficult to de-implement, the investigators must choose SW-CRT designs rather than CRXO designs even if our findings show that an optimal CRXO design is more efficient than an optimal SW-CRT design. Hence, in practice, it is possible that an OD under an SW-CRT would be more appropriate and appealing, and it should remain a compelling design option.

Supplementary Material

Supplementary Figures-1
Supplementary Figures-2

Figure 4.

Optimal designs with the lowest cost and 80% power to detect a treatment effect size of 0.2 and variance of σ² = 1 for known association parameters α0=0.05, α1=0.020, α2=0.6, assuming cost per cluster c=3000, cost per participant s=200, and cost per time-period e=50 across six designs

Acknowledgements

We thank the Alvin J. Siteman Cancer Center at Washington University School of Medicine and Barnes-Jewish Hospital in St. Louis, MO (P30 CA91842), National Institutes of Health (NIH) grant P50 CA244431, Institute of Clinical and Translational Sciences (ICTS) grant CTRFP2019-05, and a Patient-Centered Outcomes Research Institute Award® (PCORI® Award ME-2022C2-27676) for supporting this research. The content is solely the responsibility of the authors and does not necessarily represent the official view of the NIH, PCORI®, or its Board of Governors or Methodology Committee.

Appendix. ODs with the highest power

Here, we focus on ODs with the lowest ETE variance and thus the highest power. In addition to considering design estimates based on integers (i.e. whole humans), which provide investigators with the optimal practical design, we also introduce theoretical ODs at an exact budget B. In closed-cohort LCRTs,

$$B = m\,(c + sn + eTn). \tag{4}$$

In repeated cross-sectional LCRTs,

$$B = m\,(c + sTn). \tag{5}$$

The theoretical ODs allow decimal values for {n,m} at an exact budget B with Equation (4) or (5), whereas the practical ODs with integer values for {n,m} have a total cost less than or equal to budget B. Therefore, we can investigate the performance of practical ODs by comparing them with theoretical ODs.

1. OD formulae with the highest power for LCRTs

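Following the proof in Liu et al.,14 the OD formulae for closed-cohort PA-LCRTs using the ETE variance (1.1) are as follows:

$$n_{OD} = \sqrt{\frac{\vartheta c}{s+eT}}, \qquad m_{OD} = \frac{B}{\sqrt{\vartheta\,(s+eT)\,c} + c}, \tag{6}$$

where $\vartheta = \frac{1+(T-1)\alpha_2}{\alpha_0+(T-1)\alpha_1} - 1$. Recall that $\lambda_4 = 1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2 = \lambda_2\left(1+\frac{n}{\vartheta}\right)$, where $\lambda_2 = 1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, so that $\vartheta = \lambda_2 / \{\alpha_0+(T-1)\alpha_1\}$. Then the variance of the treatment effect estimator can be written as

$$\mathrm{Var}(\beta^*) = \frac{\sigma^2\lambda_4}{Tnm\pi(1-\pi)} = \frac{\sigma^2\lambda_2\left(1+\frac{n}{\vartheta}\right)}{T\pi(1-\pi)\,nm}.$$

Under the OD,

$$n_{OD}\,m_{OD} = \frac{B}{\sqrt{s+eT}\left(\sqrt{s+eT}+\sqrt{c/\vartheta}\right)}$$

and thus the ETE variance of this OD is

$$\mathrm{Var}(\beta^*)_{OD} = \frac{\sigma^2\lambda_4}{Tmn\pi(1-\pi)} = \frac{\sigma^2\lambda_2\left(1+\frac{1}{\vartheta}\sqrt{\frac{\vartheta c}{s+eT}}\right)}{T\pi(1-\pi)\,\dfrac{B}{\sqrt{s+eT}\left(\sqrt{s+eT}+\sqrt{c/\vartheta}\right)}} = \frac{\sigma^2\lambda_2}{T\pi(1-\pi)B}\times\left(\sqrt{s+eT}+\sqrt{\frac{c}{\vartheta}}\right)^2.$$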
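
The closed-form expressions can be checked numerically. The sketch below is a minimal illustration, assuming the reading of Equation (6) in which n_OD = sqrt(ϑc/(s+eT)) and m_OD = B/(sqrt(ϑ(s+eT)c) + c). It evaluates the decimal OD for a closed-cohort PA-LCRT with T=4, (α0, α1, α2) = (0.05, 0.020, 0.6), B=300,000, c=3000, s=200, and e=50, and then confirms that the budget is spent exactly and that the direct and closed-form variances coincide. Function names are illustrative.

```python
from math import isclose, sqrt


def theta_pa(T, a0, a1, a2):
    """Design constant for closed-cohort PA-LCRTs."""
    return (1 + (T - 1) * a2) / (a0 + (T - 1) * a1) - 1


def decimal_od(B, c, s, e, T, a0, a1, a2):
    """Decimal-valued (n_OD, m_OD) from the closed-form OD formulae."""
    th = theta_pa(T, a0, a1, a2)
    n_od = sqrt(th * c / (s + e * T))
    m_od = B / (sqrt(th * (s + e * T) * c) + c)
    return n_od, m_od


def var_direct(n, m, T, a0, a1, a2, sigma2=1.0, pi=0.5):
    """Variance sigma^2 * lambda4 / (T n m pi (1 - pi))."""
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    return sigma2 * lam4 / (T * n * m * pi * (1 - pi))


def var_closed_form(B, c, s, e, T, a0, a1, a2, sigma2=1.0, pi=0.5):
    """Variance at the OD, written in terms of lambda2 and theta."""
    th = theta_pa(T, a0, a1, a2)
    lam2 = 1 - a0 - (T - 1) * (a1 - a2)
    return sigma2 * lam2 / (T * pi * (1 - pi) * B) * (sqrt(s + e * T) + sqrt(c / th)) ** 2


B, c, s, e, T = 300_000, 3000, 200, 50, 4
a0, a1, a2 = 0.05, 0.020, 0.6

n_od, m_od = decimal_od(B, c, s, e, T, a0, a1, a2)
spent = m_od * (c + s * n_od + e * T * n_od)  # budget constraint, Equation (4)
v1 = var_direct(n_od, m_od, T, a0, a1, a2)
v2 = var_closed_form(B, c, s, e, T, a0, a1, a2)

print(f"n_OD = {n_od:.2f}, m_OD = {m_od:.2f}, cost = {spent:.1f}")
print(f"variance (direct) = {v1:.6f}, variance (closed form) = {v2:.6f}")
```

In this setting the decimal OD is roughly n ≈ 13.5 and m ≈ 35.6, which sits close to the integer OD (m=38, n=12) reported for the same parameters in the comparison that follows.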

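Liu et al. also derived the OD formulae for closed-cohort CRXO designs when T is a predetermined value.14 They are the same as Equation (6) but with a different $\vartheta = \frac{1-\alpha_2}{\alpha_0-\alpha_1} - 1$. The ETE variance of this OD is

$$\mathrm{Var}(\beta^*)_{OD} = \frac{\sigma^2\lambda_1}{T\pi(1-\pi)B}\times\left(\sqrt{s+eT}+\sqrt{\frac{c}{\vartheta}}\right)^2.$$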

This proof is provided in Appendix 2 of Liu et al.14

As described earlier in Section 2.4, we set α2=α1 in repeated cross-sectional trials. Therefore, Equation (6) becomes $n_{OD} = \sqrt{\frac{\vartheta c}{Ts}}$, $m_{OD} = \frac{B}{\sqrt{\vartheta Tsc}+c}$, where $\vartheta = \frac{1-\alpha_0}{\alpha_0+(T-1)\alpha_1}$ and $\vartheta = \frac{1-\alpha_0}{\alpha_0-\alpha_1}$ for repeated cross-sectional PA-LCRTs and CRXO trials, respectively. As α0 increases, power is non-increasing for optimal PA-LCRTs and CRXO trials, whereas as α1 increases, power decreases for optimal PA-LCRTs. Finally, as α2 increases, power decreases for optimal closed-cohort PA-LCRTs but increases for optimal closed-cohort CRXO trials. OD formulae are not available for closed-cohort SW-CRTs because closed-form expressions such as Equation (6) cannot be derived from variance (3.1); they can only be derived for PA-LCRTs and CRXO trials.

2. OD algorithms with the highest power for LCRTs

Although cluster-period size and the number of clusters must be integer values in practice, the expressions in Equation (6) can yield non-integer values. To obtain integer estimates, we propose the following OD algorithm for closed-cohort PA-LCRTs:

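Step 1. Specify the design space $n \in [2, n_{max}]$ and $m \in [2, m_{max}]$.

Step 2. For each combination of integer values of n and m in the design space,

  1. calculate the eigenvalues $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ and check that $\min(\lambda_1, \lambda_2, \lambda_3, \lambda_4) > 0$, where $\lambda_1 = 1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_2 = 1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, $\lambda_3 = 1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, and $\lambda_4 = 1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$;

  2. calculate the total cost as $m(c+sn+eTn)$ and check that it is less than or equal to the budget B;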

  3. if conditions 1) and 2) are satisfied, calculate the variance (1.1).

Step 3. Select the design with the smallest variance in the design space, nOD,mOD.
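
The three steps above can be sketched as a brute-force grid search. This is a minimal illustration rather than the authors' implementation. It adds one assumption that is not stated in the steps: m is restricted to even values so that clusters split equally between arms (π=0.5), which matches the even cluster counts reported for PA-LCRTs. Power is computed with a two-sided normal approximation, which is also our assumption. Function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def eigenvalues(n, T, a0, a1, a2):
    """Eigenvalues of the block exchangeable correlation matrix (Step 2.1)."""
    lam1 = 1 - a0 + a1 - a2
    lam2 = 1 - a0 - (T - 1) * (a1 - a2)
    lam3 = 1 + (n - 1) * (a0 - a1) - a2
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    return lam1, lam2, lam3, lam4


def integer_od(B, c, s, e, T, a0, a1, a2, sigma2=1.0, pi=0.5,
               n_max=5000, m_max=5000):
    """Return (variance, n, m, cost) of the integer design with smallest variance."""
    best = None
    for n in range(2, n_max + 1):
        lams = eigenvalues(n, T, a0, a1, a2)
        if min(lams) <= 0:
            continue  # correlation matrix is not positive definite
        for m in range(2, m_max + 1, 2):  # assumption: even m for equal allocation
            cost = m * (c + s * n + e * T * n)  # Step 2.2
            if cost > B:
                break  # cost grows with m, so larger m is also infeasible
            var = sigma2 * lams[3] / (T * n * m * pi * (1 - pi))  # Step 2.3
            if best is None or var < best[0]:
                best = (var, n, m, cost)
    return best  # Step 3


def power(var, delta, alpha=0.05):
    """Two-sided power under a normal approximation (assumption)."""
    z = NormalDist()
    return z.cdf(abs(delta) / sqrt(var) - z.inv_cdf(1 - alpha / 2))


var_od, n_od, m_od, cost_od = integer_od(
    B=300_000, c=3000, s=200, e=50, T=4, a0=0.05, a1=0.020, a2=0.6
)
power_od = power(var_od, delta=0.2)
print(f"n = {n_od}, m = {m_od}, cost = {cost_od}, power = {power_od:.3f}")
```

For T=4, (α0, α1, α2) = (0.05, 0.020, 0.6), and B=300,000, this search returns n=12 and m=38 with power of about 0.569, in line with the values reported for the closed-cohort PA-LCRT in the comparison that follows. Without the even-m restriction the search would select a different design, so that restriction matters for reproducing the tabulated result.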

Liu et al. have proposed OD algorithms to obtain integer estimates in CRXO trials and SW-CRTs.14 They are similar to the above algorithm except that the ETE variance is different: (1.1) in Steps 2–3 is replaced by (2.1) and (3.1), respectively. The OD algorithms can also be applied to repeated cross-sectional trials. For closed-cohort and repeated cross-sectional SW-CRTs, Liu et al. showed that the optimal number of time-periods T is equal to the number of treatment sequences L plus 1.14

Appendix Table 1 summarizes all proposed ODs including the algorithms for estimating integer values and the formulae for estimating decimal values for PA-LCRTs, multiple-period CRXO trials and SW-CRTs.

3. Comparing ODs with the highest power across six LCRTs

For comparison across the six ODs with the highest power, we assume the same type I error, treatment effect, and unit costs as those described in Section 3.1 and a budget B=300,000 for known association parameter α0,α1,α2.

3.1. Comparing ODs at a fixed value of T

Appendix Table 2 details the obtained power, number of clusters m_OD, optimal cluster-period size n_OD, and total sample size N_OD under the OD with the highest power for all six ODs assuming T=4. For example, for (α0, α1, α2) = (0.05, 0.020, 0.6), the OD with the highest power for a closed-cohort PA-LCRT has an obtained power of 0.569, number of clusters of 38, cluster-period size of 12, and total sample size of 456. These values are 0.599, 30, 7, and 840, respectively, for a repeated cross-sectional PA-LCRT; 0.999, 48, 8, and 384, respectively, for a closed-cohort CRXO trial; 0.773, 20, 12, and 960, respectively, for a repeated cross-sectional CRXO trial; 0.796, 45, 9, and 405, respectively, for a closed-cohort SW-CRT; and 0.390, 30, 7, and 840, respectively, for a repeated cross-sectional SW-CRT.

Appendix Figure 1 presents the same estimates as provided in Appendix Table 2 but allows the correlation values to vary. Specifically, each panel keeps two of the three correlation values (α0, α1, α2) constant and allows the remaining correlation to increase. Appendix Figures 1I–1L present horizontal lines for repeated cross-sectional CRXO trials and SW-CRTs.

3.1.1. Comparing power under ODs at a fixed value of T=4

As α0 increases, power is non-increasing for all six ODs, whereas as α2 increases, power increases for optimal closed-cohort SW-CRTs.

Based on these figures, we can see clearly that optimal closed-cohort CRXO trials have the largest power, followed by optimal repeated cross-sectional CRXO trials and optimal closed-cohort SW-CRTs.

3.1.2. Comparing optimal number of clusters and cluster-period size at a fixed value of T=4

With respect to the optimal number of clusters, closed-cohort CRXO trials have one of the largest numbers, and repeated cross-sectional CRXO trials have one of the smallest numbers.

As α0 increases, the number of clusters is non-decreasing, whereas the cluster-period size is non-increasing for all six ODs. As α1 increases, 1) the optimal number of clusters is non-decreasing and the optimal cluster-period size is non-increasing for PA-LCRTs; and 2) the optimal number of clusters is non-increasing and the optimal cluster-period size is non-decreasing for CRXO trials and SW-CRTs. Finally, as α2 increases, 1) the optimal number of clusters is non-increasing and the optimal cluster-period size is non-decreasing for closed-cohort PA-LCRTs; and 2) the optimal number of clusters is non-decreasing and the optimal cluster-period size is non-increasing for closed-cohort CRXO trials and SW-CRTs.

3.1.3. Comparing the total sample size under ODs at a fixed value of T=4

Repeated cross-sectional CRXO trials and SW-CRTs have the largest total sample sizes. As α0 increases, the total sample sizes are non-increasing for all six ODs. As α1 increases, the total sample size is 1) non-increasing for PA-LCRTs; and 2) non-decreasing for CRXO trials and SW-CRTs. As α2 increases, the total sample size is 1) non-decreasing for closed-cohort PA-LCRTs; and 2) non-increasing for closed-cohort CRXO trials and SW-CRTs.

3.2. Comparing ODs at different values of T

Appendix Table 3 extends the above analyses by varying values of T, while holding the correlation parameters constant at α0=0.05,α1=0.020,α2=0.6 for all six ODs. Appendix Figure 2 presents the corresponding results visually.

As T increases, we see that: 1) power decreases for the two PA-LCRT designs and increases for the other four designs; 2) the optimal number of clusters generally decreases; 3) the optimal cluster-period size generally decreases; and 4) the total sample size generally decreases.

Appendix Tables 4–5 and Appendix Figures 3–4 show the corresponding decimal estimates for four of the trial designs: PA-LCRTs and multiple-period CRXO trials, each in closed-cohort and repeated cross-sectional form. For a fixed T, the optimal CRXO trials have higher power than the optimal PA-LCRTs. The optimal closed-cohort CRXO trials reach the highest power, followed by the optimal repeated cross-sectional CRXO trials. The optimal repeated cross-sectional trials have the largest total sample size.

For a fixed T, 1) as α0 increases, the power decreases, optimal number of clusters increases, optimal cluster-period size decreases, and total sample size decreases. 2) as α1 increases, the power decreases for optimal PA-LCRTs but increases for CRXO trials; the optimal number of clusters increases for PA-LCRTs but decreases for CRXO trials, whereas the optimal cluster-period size decreases for PA-LCRTs but increases for CRXO trials; the total sample size decreases for optimal PA-LCRTs but increases for optimal CRXO trials; 3) as α2 increases, the power decreases for optimal closed-cohort PA-LCRTs, whereas the power increases for closed-cohort CRXO trials; the optimal number of clusters decreases for closed-cohort PA-LCRT but increases for closed-cohort CRXO trials, whereas the optimal cluster-period size increases for closed-cohort PA-LCRTs but decreases for closed-cohort CRXO trials; the total sample size increases for closed-cohort PA-LCRTs but decreases for closed-cohort CRXO trials.

As T increases: 1) the power decreases for optimal closed-cohort designs but increases for optimal repeated cross-sectional designs; 2) the optimal cluster-period size and the optimal number of clusters decrease for all four ODs; 3) the total sample size decreases for optimal closed-cohort designs but increases for optimal repeated cross-sectional designs.

Appendix Figure 1.

Optimal designs with the highest power to detect a treatment effect size of 0.2 and variance of σ2=1 for known association parameter (α0,α1,α2) assuming a budget B=300,000, cost per cluster c=3000, cost per participant s=200, and cost per time-period e=50 across six designs with T=4

Appendix Figure 2.

Optimal designs with the highest power to detect a treatment effect size of 0.2 and variance of σ2=1 for known association parameter (α0=0.05, α1=0.020, α2=0.6) assuming a budget B=300,000, cost per cluster c=3000, cost per participant s=250, and cost per time-period e=50 across six designs

Appendix Figure 3.

Optimal designs with the highest power to detect a treatment effect size of 0.2 and variance of σ2=1 for known association parameter (α0,α1,α2) assuming a budget B=300,000, cost per cluster c=3000, cost per participant s=200, and cost per time-period e=50 across four designs with T=4 (Decimal estimates)

Appendix Figure 4.

Optimal designs with the highest power to detect a treatment effect size of 0.2 and variance of σ2=1 for known association parameter (α0=0.05, α1=0.020, α2=0.6) assuming a budget B=300,000, cost per cluster c=3000, cost per participant s=200, and cost per time-period e=50 across four designs (Decimal estimates)

Table 1.

Optimal designs with the highest power for PA-LCRTs, multiple-period CRXO trials, and SW-CRTs

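Each design below lists the algorithm for integer estimates, followed by the formulae for decimal estimates.

Design: Common, unless otherwise specified
Algorithm for integer estimates:
Step 1. Specify the design space $n \in [2, n_{\max}]$ and $m \in [2, m_{\max}]$.
Step 2. For each combination of integer values of $n$ and $m$ in the design space,
  1) calculate the eigenvalues and check that the minimum of the eigenvalues is $> 0$;
  2) calculate the total cost and check that it is less than or equal to the budget $B$;
  3) if conditions 1) and 2) are satisfied, calculate the variance.
Step 3. Select the design with the smallest variance in the design space.

Design: Closed-cohort PA-LCRTs
Algorithm for integer estimates:
  1) eigenvalues $\lambda_1 = 1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_2 = 1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, $\lambda_3 = 1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, $\lambda_4 = 1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$;
  2) total cost $m(c+sn+eTn)$;
  3) variance (1.1).
Formulae for decimal estimates:
  $n_{OD} = \sqrt{\dfrac{\vartheta c}{s+eT}}$;
  $m_{OD} = \dfrac{B}{\sqrt{\vartheta (s+eT) c}+c}$;
  $\vartheta = \dfrac{1+(T-1)\alpha_2}{\alpha_0+(T-1)\alpha_1}-1$.

Design: Repeated cross-sectional PA-LCRTs
Algorithm for integer estimates:
  1) eigenvalues $\lambda_1^R = 1-\alpha_0$, $\lambda_3^R = 1+(n-1)\alpha_0-n\alpha_1$, $\lambda_4^R = 1+(n-1)\alpha_0+(T-1)n\alpha_1$;
  2) total cost $m\{c+(s+e)Tn\}$;
  3) variance (1.2).
Formulae for decimal estimates:
  $n_{OD} = \sqrt{\dfrac{\vartheta c}{Ts}}$;
  $m_{OD} = \dfrac{B}{\sqrt{\vartheta T s c}+c}$;
  $\vartheta = \dfrac{1-\alpha_0}{\alpha_0+(T-1)\alpha_1}$.

Design: Closed-cohort multiple-period CRXO trials
Algorithm for integer estimates:
  1) eigenvalues $\lambda_1 = 1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_2 = 1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, $\lambda_3 = 1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, $\lambda_4 = 1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$;
  2) total cost $m(c+sn+eTn)$;
  3) variance (2.1).
  Ref: Liu et al.
Formulae for decimal estimates:
  $n_{OD} = \sqrt{\dfrac{\vartheta c}{s+eT}}$;
  $m_{OD} = \dfrac{B}{\sqrt{\vartheta (s+eT) c}+c}$;
  $\vartheta = \dfrac{1-\alpha_2}{\alpha_0-\alpha_1}-1$.
  Ref: Liu et al.

Design: Repeated cross-sectional multiple-period CRXO trials
Algorithm for integer estimates:
  1) eigenvalues $\lambda_1^R = 1-\alpha_0$, $\lambda_3^R = 1+(n-1)\alpha_0-n\alpha_1$, $\lambda_4^R = 1+(n-1)\alpha_0+(T-1)n\alpha_1$;
  2) total cost $m\{c+(s+e)Tn\}$;
  3) variance (2.2).
  Ref: Liu et al.
Formulae for decimal estimates:
  $n_{OD} = \sqrt{\dfrac{\vartheta c}{Ts}}$;
  $m_{OD} = \dfrac{B}{\sqrt{\vartheta T s c}+c}$;
  $\vartheta = \dfrac{1-\alpha_0}{\alpha_0-\alpha_1}$.
  Ref: Liu et al.

Design: Closed-cohort SW-CRT
Algorithm for integer estimates:
Step 1. Specify the design space $n \in [2, n_{\max}]$ and $m \in [L, m_{\max}]$.
Step 2.
  1) eigenvalues $\lambda_1 = 1-\alpha_0+\alpha_1-\alpha_2$, $\lambda_2 = 1-\alpha_0-(T-1)(\alpha_1-\alpha_2)$, $\lambda_3 = 1+(n-1)(\alpha_0-\alpha_1)-\alpha_2$, $\lambda_4 = 1+(n-1)\alpha_0+(T-1)(n-1)\alpha_1+(T-1)\alpha_2$;
  2) total cost $m(c+sn+eTn)$;
  3) variance (3.1).
  Ref: Liu et al.
Formulae for decimal estimates: NA

Design: Repeated cross-sectional SW-CRT
Algorithm for integer estimates:
Step 1. Specify the design space $n \in [2, n_{\max}]$ and $m \in [L, m_{\max}]$.
Step 2.
  1) calculate the eigenvalues $\lambda_1^R, \lambda_3^R, \lambda_4^R$ and check that $\min\{\lambda_1^R, \lambda_3^R, \lambda_4^R\} > 0$, where $\lambda_1^R = 1-\alpha_0$, $\lambda_3^R = 1+(n-1)\alpha_0-n\alpha_1$, $\lambda_4^R = 1+(n-1)\alpha_0+(T-1)n\alpha_1$;
  2) total cost $m\{c+(s+e)Tn\}$;
  3) variance (3.2).
  Ref: Liu et al.
Formulae for decimal estimates: NA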

α0: within-period correlation; α1: inter-period correlation; α2: within-participant correlation; T: number of time-periods; L: number of treatment sequences; n: cluster-period size (number of participants per cluster per period); nmax: maximum cluster-period size; m: total number of clusters; mmax: maximum number of clusters; c: cost per cluster; s: cost per participant; for closed-cohort LCRTs, e: cost per time-period; NA: not applicable.
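The integer-estimate search in Table 1 (Steps 1 to 3) can be sketched as a plain grid search. The sketch below is illustrative only and covers the closed-cohort case: the function names are hypothetical, and the variance expressions (1.1), (2.1) and (3.1) are not reproduced here, so the variance is passed in as a user-supplied callable `variance_fn(m, n)`.

```python
from typing import Callable, Optional, Tuple


def eigenvalues_closed_cohort(n: int, T: int, a0: float, a1: float, a2: float):
    """Eigenvalues of the block exchangeable correlation matrix (closed-cohort)."""
    lam1 = 1 - a0 + a1 - a2
    lam2 = 1 - a0 - (T - 1) * (a1 - a2)
    lam3 = 1 + (n - 1) * (a0 - a1) - a2
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    return lam1, lam2, lam3, lam4


def total_cost_closed_cohort(m: int, n: int, T: int, c: float, s: float, e: float) -> float:
    """Total cost m(c + s*n + e*T*n) for a closed-cohort design."""
    return m * (c + s * n + e * T * n)


def grid_search(
    variance_fn: Callable[[int, int], float],  # placeholder for variance (1.1)/(2.1)/(3.1)
    T: int, a0: float, a1: float, a2: float,
    B: float, c: float, s: float, e: float,
    n_max: int, m_max: int, m_min: int = 2,    # use m_min = L for SW-CRTs
) -> Optional[Tuple[float, int, int]]:
    """Return (variance, m, n) with the smallest variance, or None if nothing is feasible."""
    best = None
    for m in range(m_min, m_max + 1):
        for n in range(2, n_max + 1):
            # Condition 2: the budget; cost grows with n, so stop this row early.
            if total_cost_closed_cohort(m, n, T, c, s, e) > B:
                break
            # Condition 1: all eigenvalues must be positive.
            if min(eigenvalues_closed_cohort(n, T, a0, a1, a2)) <= 0:
                continue
            # Condition 3: evaluate the variance and keep the smallest one (Step 3).
            var = variance_fn(m, n)
            if best is None or var < best[0]:
                best = (var, m, n)
    return best


# Usage with a toy variance (NOT the paper's formula), only to exercise the search.
toy = grid_search(lambda m, n: 1.0 / (m * n), T=4, a0=0.05, a1=0.02, a2=0.2,
                  B=300_000, c=3000, s=200, e=50, n_max=50, m_max=100)
print(toy)
```

Replacing the toy callable with the design-specific variance would give the corresponding integer OD; the repeated cross-sectional case would swap in the three eigenvalues and the cost function listed for those designs.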

Table 2.

Optimal designs with the highest power under a budget B=300,000 for six designs with T=4 for known association parameter α0,α1,α2

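| Association parameters (α0, α1, α2) | PA-LCRT closed-cohort: Power | m | n | N | PA-LCRT repeated cross-sectional: Power | m | n | N | CRXO closed-cohort: Power | m | n | N | CRXO repeated cross-sectional: Power | m | n | N | SW-CRT (L=3) closed-cohort: Power | m | n | N | SW-CRT (L=3) repeated cross-sectional: Power | m | n | N |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (0.05, 0.020, 0.2) | 0.723 | 40 | 11 | 440 | 0.599 | 30 | 7 | 840 | 0.980 | 38 | 12 | 456 | 0.773 | 20 | 12 | 960 | 0.655 | 33 | 15 | 495 | 0.390 | 30 | 7 | 840 |
| (0.05, 0.020, 0.6) | 0.569 | 38 | 12 | 456 | | | | | 0.999 | 48 | 8 | 384 | | | | | 0.796 | 45 | 9 | 405 | | | | |
| (0.05, 0.020, 0.8) | 0.513 | 36 | 13 | 468 | | | | | >0.999 | 60 | 5 | 300 | | | | | 0.913 | 51 | 7 | 357 | | | | |
| (0.05, 0.040, 0.2) | 0.650 | 48 | 8 | 384 | 0.528 | 42 | 4 | 672 | 0.997 | 20 | 30 | 600 | 0.852 | 12 | 22 | 1056 | 0.742 | 27 | 20 | 540 | 0.407 | 15 | 17 | 1020 |
| (0.05, 0.040, 0.6) | 0.507 | 40 | 11 | 440 | | | | | >0.999 | 28 | 19 | 532 | | | | | 0.908 | 33 | 15 | 495 | | | | |
| (0.05, 0.040, 0.8) | 0.460 | 38 | 12 | 456 | | | | | >0.999 | 38 | 12 | 456 | | | | | 0.984 | 36 | 13 | 468 | | | | |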

Treatment effect=0.2; outcome variance=1; α0: within-period correlation; α1: inter-period correlation; α2: within-participant correlation; T: number of time-periods; L: number of treatment sequences; cost per cluster c=3000; for closed-cohort LCRTs, cost per participant s=200 and cost per time-period e=50; for repeated cross-sectional LCRTs, cost per participant s=250; m: total number of clusters; mmax=5000; n: cluster-period size (number of participants per cluster per period); nmax=5000; N: total sample size. Only α0 and α1 are needed for repeated cross-sectional trials, so those results are reported once for each (α0, α1) pair.
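As a quick arithmetic check, the first row of the table above can be tested against the budget constraint. The sketch below is illustrative and rests on two readings of the table and its footnote that should be treated as assumptions: the repeated cross-sectional cost is taken as m(c + 250·T·n), that is, 250 per participant per period, and the total sample size is taken as N = m·n for closed-cohort designs and N = m·n·T for repeated cross-sectional designs.

```python
T = 4          # number of time-periods
B = 300_000    # budget
c = 3000       # cost per cluster


def cost_closed_cohort(m, n, s=200, e=50):
    """Closed-cohort cost m(c + s*n + e*T*n)."""
    return m * (c + s * n + e * T * n)


def cost_cross_sectional(m, n, s=250):
    """Repeated cross-sectional cost, assuming 250 per participant per period."""
    return m * (c + s * T * n)


# (m, n, closed_cohort?) for the row (alpha0, alpha1, alpha2) = (0.05, 0.020, 0.2)
designs = {
    "PA-LCRT closed-cohort": (40, 11, True),
    "PA-LCRT repeated cross-sectional": (30, 7, False),
    "CRXO closed-cohort": (38, 12, True),
    "CRXO repeated cross-sectional": (20, 12, False),
    "SW-CRT closed-cohort": (33, 15, True),
    "SW-CRT repeated cross-sectional": (30, 7, False),
}

summary = {}
for name, (m, n, closed) in designs.items():
    cost = cost_closed_cohort(m, n) if closed else cost_cross_sectional(m, n)
    N = m * n if closed else m * n * T   # assumed definition of N
    summary[name] = (cost, N)
    print(f"{name}: cost={cost}, within budget={cost <= B}, N={N}")
```

Under these assumptions every design in the row spends at most the budget B, and the computed N values match the N columns of the table.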

Table 3.

Optimal designs with the highest power under a budget B=300,000 for six designs with α0=0.05, α1=0.020 and α2=0.6

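| T | PA-LCRT closed-cohort: Power | m | n | N | PA-LCRT repeated cross-sectional: Power | m | n | N | CRXO closed-cohort: Power | m | n | N | CRXO repeated cross-sectional: Power | m | n | N | SW-CRT (L=T−1) closed-cohort: Power | m | n | N | SW-CRT (L=T−1) repeated cross-sectional: Power | m | n | N |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 0.569 | 38 | 12 | 456 | 0.599 | 30 | 7 | 840 | >0.999 | 48 | 8 | 384 | 0.773 | 20 | 12 | 960 | 0.796 | 45 | 9 | 405 | 0.390 | 30 | 7 | 840 |
| 6 | 0.539 | 30 | 14 | 420 | 0.621 | 40 | 3 | 720 | >0.999 | 40 | 9 | 360 | 0.811 | 20 | 8 | 960 | 0.922 | 40 | 9 | 360 | 0.471 | 25 | 6 | 900 |
| 8 | 0.495 | 26 | 14 | 364 | 0.613 | 32 | 3 | 768 | >0.999 | 38 | 8 | 304 | 0.830 | 20 | 6 | 960 | 0.958 | 35 | 9 | 315 | 0.475 | 14 | 9 | 1008 |
| 10 | 0.469 | 30 | 10 | 300 | 0.619 | 28 | 3 | 840 | >0.999 | 32 | 9 | 288 | 0.830 | 16 | 6 | 960 | 0.972 | 36 | 7 | 252 | 0.501 | 27 | 3 | 810 |
| 12 | 0.428 | 22 | 13 | 286 | 0.622 | 32 | 2 | 768 | >0.999 | 38 | 6 | 228 | 0.850 | 20 | 4 | 960 | 0.982 | 33 | 7 | 231 | 0.521 | 33 | 2 | 792 |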

Treatment effect=0.2; outcome variance=1; α0: within-period correlation; α1: inter-period correlation; α2: within-participant correlation; T: number of time-periods; L: number of treatment sequences; cost per cluster c=3000; for closed-cohort LCRTs, cost per participant s=200 and cost per time-period e=50; for repeated cross-sectional LCRTs, cost per participant s=250; m: total number of clusters; mmax=5000; n: cluster-period size (number of participants per cluster per period); nmax=5000; N: total sample size. Only α0 and α1 are needed for repeated cross-sectional trials.

Table 4.

Optimal designs with the highest power under a budget B=300,000 for four designs with T=4 for known association parameter α0,α1,α2 (Decimal estimates)

| Association parameters (α0, α1, α2) | PA-LCRT closed-cohort: Power | m | n | N | PA-LCRT repeated cross-sectional: Power | m | n | N | CRXO closed-cohort: Power | m | n | N | CRXO repeated cross-sectional: Power | m | n | N |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (0.05, 0.020, 0.2) | 0.730 | 42.7 | 10.1 | 430.0 | 0.609 | 37.1 | 5.1 | 755.0 | 0.982 | 35.1 | 13.9 | 486.8 | 0.776 | 23.5 | 9.7 | 917.6 |
| (0.05, 0.020, 0.6) | 0.575 | 35.6 | 13.5 | 482.7 | | | | | >0.999 | 43.8 | 9.6 | 421.4 | | | | |
| (0.05, 0.020, 0.8) | 0.521 | 33.4 | 15.0 | 499.7 | | | | | >0.999 | 53.5 | 6.5 | 348.8 | | | | |
| (0.05, 0.040, 0.2) | 0.654 | 48.6 | 7.9 | 385.8 | 0.536 | 42.3 | 4.1 | 692.6 | 0.997 | 23.6 | 24.3 | 573.3 | 0.855 | 15.1 | 16.9 | 1018.9 |
| (0.05, 0.040, 0.6) | 0.512 | 41.0 | 10.8 | 442.1 | | | | | >0.999 | 30.5 | 17.1 | 521.4 | | | | |
| (0.05, 0.040, 0.8) | 0.465 | 38.6 | 11.9 | 460.6 | | | | | >0.999 | 38.6 | 11.9 | 460.6 | | | | |

Treatment effect=0.2; outcome variance=1; α0: within-period correlation; α1: inter-period correlation; α2: within-participant correlation; T: number of time-periods; cost per cluster c=3000; for closed-cohort LCRTs, cost per participant s=200 and cost per time-period e=50; for repeated cross-sectional LCRTs, cost per participant s=250; m: total number of clusters; mmax=5000; n: cluster-period size (number of participants per cluster per period); nmax=5000; N: total sample size. Only α0 and α1 are needed for repeated cross-sectional trials, so those results are reported once for each (α0, α1) pair.
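The decimal estimates of m and n above appear to follow directly from the closed-form formulae in Table 1. A minimal sketch, assuming the square-root form of those formulae and using hypothetical design labels (`pa_cc`, `pa_rcs`, `crxo_cc`, `crxo_rcs`); power is not computed here because the variance expressions are not reproduced in this section.

```python
import math


def decimal_od(design, T, a0, a1, a2, B, c, s, e=0.0):
    """Return (m_OD, n_OD) from the closed-form formulae, as decimal values.

    design: 'pa_cc', 'pa_rcs', 'crxo_cc' or 'crxo_rcs' (labels are illustrative).
    For repeated cross-sectional designs, a2 and e are ignored.
    """
    if design == "pa_cc":
        theta = (1 + (T - 1) * a2) / (a0 + (T - 1) * a1) - 1
        unit = s + e * T          # cost per participant over all periods
    elif design == "pa_rcs":
        theta = (1 - a0) / (a0 + (T - 1) * a1)
        unit = T * s              # one new participant per period
    elif design == "crxo_cc":
        theta = (1 - a2) / (a0 - a1) - 1
        unit = s + e * T
    elif design == "crxo_rcs":
        theta = (1 - a0) / (a0 - a1)
        unit = T * s
    else:
        raise ValueError(f"unknown design: {design}")
    n_od = math.sqrt(theta * c / unit)
    m_od = B / (math.sqrt(theta * unit * c) + c)
    return m_od, n_od


# First row of the table: (alpha0, alpha1, alpha2) = (0.05, 0.020, 0.2), T = 4
m_pa, n_pa = decimal_od("pa_cc", 4, 0.05, 0.02, 0.2, B=300_000, c=3000, s=200, e=50)
m_cx, n_cx = decimal_od("crxo_rcs", 4, 0.05, 0.02, 0.2, B=300_000, c=3000, s=250)
print(round(m_pa, 1), round(n_pa, 1))
print(round(m_cx, 1), round(n_cx, 1))
```

Rounded to one decimal place, these values agree with the closed-cohort PA-LCRT and repeated cross-sectional CRXO entries in the first row of the table.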

Table 5.

Optimal designs with the highest power under a budget B=300,000 for four designs with α0=0.05, α1=0.020 and α2=0.6 (Decimal estimates)

| T | PA-LCRT closed-cohort: Power | m | n | N | PA-LCRT repeated cross-sectional: Power | m | n | N | CRXO closed-cohort: Power | m | n | N | CRXO repeated cross-sectional: Power | m | n | N |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 0.575 | 35.6 | 13.5 | 482.7 | 0.609 | 37.1 | 5.1 | 755.0 | >0.999 | 43.8 | 9.6 | 421.4 | 0.776 | 23.5 | 9.7 | 917.6 |
| 6 | 0.540 | 32.6 | 12.4 | 404.5 | 0.624 | 36.0 | 3.6 | 768.3 | >0.999 | 41.1 | 8.6 | 353.5 | 0.811 | 20.1 | 8.0 | 959.0 |
| 8 | 0.503 | 30.3 | 11.5 | 348.3 | 0.632 | 35.4 | 2.7 | 775.3 | >0.999 | 38.9 | 7.9 | 305.5 | 0.831 | 17.9 | 6.9 | 985.5 |
| 10 | 0.469 | 28.9 | 10.7 | 306.2 | 0.637 | 35.0 | 2.2 | 779.7 | >0.999 | 37.1 | 7.3 | 269.6 | 0.845 | 16.3 | 6.2 | 1004.5 |
| 12 | 0.439 | 27.1 | 10.1 | 273.4 | 0.640 | 34.8 | 1.9 | 782.7 | >0.999 | 35.5 | 6.8 | 241.7 | 0.855 | 15.1 | 5.6 | 1018.9 |

Treatment effect=0.2; outcome variance=1; α0: within-period correlation; α1: inter-period correlation; α2: within-participant correlation; T: number of time-periods; cost per cluster c=3000; for closed-cohort LCRTs, cost per participant s=200 and cost per time-period e=50; for repeated cross-sectional LCRTs, cost per participant s=250; m: total number of clusters; mmax=5000; n: cluster-period size (number of participants per cluster per period); nmax=5000; N: total sample size. Only α0 and α1 are needed for repeated cross-sectional trials.

Footnotes

Conflict of Interest

The authors have declared no conflict of interest.

References

  • 1. James AS, Richardson V, Wang JS, Proctor EK, Colditz GA. Systems intervention to promote colon cancer screening in safety net settings: protocol for a community-based participatory randomized controlled trial. Implementation Science. 2013;8(1):58. doi: 10.1186/1748-5908-8-58
  • 2. Brownson RC, Colditz GA, Proctor EK. Dissemination and Implementation Research in Health: Translating Science to Practice. Oxford University Press; 2018.
  • 3. Hemming K, Taljaard M, Moerbeek M, Forbes A. Contamination: How much can an individually randomized trial tolerate? Stat Med. Jun 30 2021;40(14):3329–3351. doi: 10.1002/sim.8958
  • 4. Faggiano F, Vigna-Taglianti F, Burkhart G, et al. The effectiveness of a school-based substance abuse prevention program: 18-Month follow-up of the EU-Dap cluster randomized controlled trial. Drug and Alcohol Dependence. 2010;108(1):56–64. doi: 10.1016/j.drugalcdep.2009.11.018
  • 5. Jeyaratnam D, Whitty CJ, Phillips K, et al. Impact of rapid screening tests on acquisition of meticillin resistant Staphylococcus aureus: cluster randomised crossover trial. BMJ (Clinical research ed). Apr 26 2008;336(7650):927–30. doi: 10.1136/bmj.39525.579063.BE
  • 6. Arnup SJ, McKenzie JE, Hemming K, Pilcher D, Forbes AB. Understanding the cluster randomised crossover design: a graphical illustration of the components of variation and a sample size tutorial. Trials. 2017;18(1):381. doi: 10.1186/s13063-017-2113-2
  • 7. Copas AJ, Lewis JJ, Thompson JA, Davey C, Baio G, Hargreaves JR. Designing a stepped wedge trial: three main designs, carry-over effects and randomisation approaches. Trials. 2015;16(1):352. doi: 10.1186/s13063-015-0842-7
  • 8. Teerenstra S, Lu B, Preisser JS, van Achterberg T, Borm GF. Sample size considerations for GEE analyses of three-level cluster randomized trials. Biometrics. Dec 2010;66(4):1230–7. doi: 10.1111/j.1541-0420.2009.01374.x
  • 9. Wang J, Cao J, Zhang S, Ahn C. A flexible sample size solution for longitudinal and crossover cluster randomized trials with continuous outcomes. Contemporary Clinical Trials. 2021;109:106543. doi: 10.1016/j.cct.2021.106543
  • 10. Giraudeau B, Ravaud P, Donner A. Sample size calculation for cluster randomized cross-over trials. Stat Med. Nov 29 2008;27(27):5578–85. doi: 10.1002/sim.3383
  • 11. Giraudeau B, Ravaud P, Donner A. Correction. Statistics in Medicine. 2009;28:720. doi: 10.1002/sim.3486
  • 12. Rietbergen C, Moerbeek M. The Design of Cluster Randomized Crossover Trials. Journal of Educational and Behavioral Statistics. 2011;36(4):472–490.
  • 13. Li F, Forbes AB, Turner EL, Preisser JS. Power and sample size requirements for GEE analyses of cluster randomized crossover trials. Stat Med. Feb 20 2019;38(4):636–649. doi: 10.1002/sim.7995
  • 14. Liu J, Li F. Optimal designs using generalized estimating equations in cluster randomized crossover and stepped wedge trials. Stat Methods Med Res. May 30 2024:9622802241247717. doi: 10.1177/09622802241247717
  • 15. Hemming K, Taljaard M. Sample size calculations for stepped wedge and cluster randomised trials: a unified approach. J Clin Epidemiol. Jan 2016;69:137–46. doi: 10.1016/j.jclinepi.2015.08.015
  • 16. Hooper R, Teerenstra S, de Hoop E, Eldridge S. Sample size calculation for stepped wedge and other longitudinal cluster randomised trials. Stat Med. Nov 20 2016;35(26):4718–4728. doi: 10.1002/sim.7028
  • 17. De Hoop E, Moerbeek M, Gerritsen D, Teerenstra S. Sample size estimation for cohort and cross-sectional cluster randomized stepped wedge designs. In: Oomen-de Hoop E, Efficient designs for cluster randomized trials with small numbers of clusters: stepped wedge and other repeated measurements designs (doctoral thesis).
  • 18. Hemming K, Lilford R, Girling AJ. Stepped-wedge cluster randomised controlled trials: a generic framework including parallel and multiple-level designs. Stat Med. Jan 30 2015;34(2):181–96. doi: 10.1002/sim.6325
  • 19. Li F, Turner EL, Preisser JS. Sample size determination for GEE analyses of stepped wedge cluster randomized trials. Biometrics. 2018;74(4):1450–1458. doi: 10.1111/biom.12918
  • 20. Teerenstra S, Taljaard M, Haenen A, et al. Sample size calculation for stepped-wedge cluster-randomized trials with more than two levels of clustering. Clinical trials (London, England). Jun 2019;16(3):225–236. doi: 10.1177/1740774519829053
  • 21. Kasza J, Hemming K, Hooper R, Matthews J, Forbes AB. Impact of non-uniform correlation structure on sample size and power in multiple-period cluster randomised trials. Stat Methods Med Res. Mar 2019;28(3):703–716. doi: 10.1177/0962280217734981
  • 22. Harrison LJ, Chen T, Wang R. Power calculation for cross-sectional stepped wedge cluster randomized trials with variable cluster sizes. Biometrics. Sep 2020;76(3):951–962. doi: 10.1111/biom.13164
  • 23. Li F. Design and analysis considerations for cohort stepped wedge cluster randomized trials with a decay correlation structure. Statistics in Medicine. 2020;39(4):438–455. doi: 10.1002/sim.8415
  • 24. Li F, Hughes JP, Hemming K, Taljaard M, Melnick ER, Heagerty PJ. Mixed-effects models for the design and analysis of stepped wedge cluster randomized trials: An overview. Stat Methods Med Res. Feb 2021;30(2):612–639. doi: 10.1177/0962280220932962
  • 25. Ouyang Y, Li F, Preisser JS, Taljaard M. Sample size calculators for planning stepped-wedge cluster randomized trials: a review and comparison. International Journal of Epidemiology. 2022;51(6):2000–2013. doi: 10.1093/ije/dyac123
  • 26. Raudenbush S. Statistical analysis and optimal design for cluster randomized trials. Psychol Methods. 1997;2:173–185.
  • 27. Raudenbush S, Liu X. Statistical power and optimal design for multisite trials. Psychol Methods. 2000;5(2):199–213.
  • 28. Moerbeek M, Van Breukelen G, Berger M. Optimal experimental design for multilevel logistic models. The Statistician. 2001;50(1):17–30.
  • 29. Moerbeek M, Van Breukelen G, Berger M. Optimal experimental designs for multilevel models with covariates. Commun Stat Theory Methods. 2001;30(12):2683–2697.
  • 30. Connelly L. Balancing the number and size of sites: an economic approach to the optimal design of cluster samples. Control Clin Trials. 2003;24:544–559.
  • 31. Headrick T, Zumbo B. On optimizing multi-level designs: power under budget constraints. Austr N Z J Stat. 2005;47(2):219–229.
  • 32. Liu X. Statistical power and optimum sample allocation ratio for treatment and control having unequal costs per unit of randomization. J Educ Behav Stat. 2003;28(3):231–248.
  • 33. Liu J, Liu L, James AS, Colditz GA. An overview of optimal designs under a given budget in cluster randomized trials with a binary outcome. Stat Methods Med Res. Jul 2023;32(7):1420–1441. doi: 10.1177/09622802231172026
  • 34. Grantham KL, Kasza J, Heritier S, Hemming K, Litton E, Forbes AB. How many times should a cluster randomized crossover trial cross over? Statistics in Medicine. 2019;38(25):5021–5033. doi: 10.1002/sim.8349
  • 35. Moerbeek M. Optimal design of cluster randomized crossover trials with a continuous outcome: Optimal number of time periods and treatment switches under a fixed number of clusters or fixed budget. Behavior Research Methods. 2024;56(8):8820–8830. doi: 10.3758/s13428-024-02505-1
  • 36. Liang K-Y, Zeger SL. Longitudinal Data Analysis Using Generalized Linear Models. Biometrika. 1986;73(1):13–22. doi: 10.2307/2336267
  • 37. Zhang Y, Preisser JS, Turner EL, Rathouz PJ, Toles M, Li F. A general method for calculating power for GEE analysis of complete and incomplete stepped wedge cluster randomized trials. Stat Methods Med Res. Jan 2023;32(1):71–87. doi: 10.1177/09622802221129861
  • 38. Preisser JS, Young ML, Zaccaro DJ, Wolfson M. An integrated population-averaged approach to the design, analysis and sample size determination of cluster-unit trials. Stat Med. Apr 30 2003;22(8):1235–54. doi: 10.1002/sim.1379
  • 39. Liu A, Shih WJ, Gehan E. Sample size and power determination for clustered repeated measurements. Statistics in Medicine. 2002;21(12):1787–1801. doi: 10.1002/sim.1154
  • 40. Wang X, Turner EL, Li F. Designing individually randomized group treatment trials with repeated outcome measurements using generalized estimating equations. Stat Med. Jan 30 2024;43(2):358–378. doi: 10.1002/sim.9966
  • 41. Hemming K, Girling AJ, Sitch AJ, Marsh J, Lilford RJ. Sample size calculations for cluster randomised controlled trials with a fixed number of clusters. BMC Med Res Methodol. Jun 30 2011;11:102. doi: 10.1186/1471-2288-11-102
  • 42. Moerbeek M, Van Breukelen G, Berger M. Design Issues for Experiments in Multilevel Populations. Journal of Educational and Behavioral Statistics. 2000;25(3):271–284.
  • 43. Van Breukelen G, Candel M. Efficient design of cluster randomized and multicentre trials with unknown intraclass correlation. Stat Methods Med Res. 2015;24(5):540–556.
  • 44. Liu J, Colditz GA. Optimal design of longitudinal data analysis using generalized estimating equation models. Biometrical journal Biometrische Zeitschrift. Mar 2017;59(2):315–330. doi: 10.1002/bimj.201600107
  • 45. Bruce ML, Ten Have TR, Reynolds CF, 3rd, et al. Reducing suicidal ideation and depressive symptoms in depressed older primary care patients: a randomized controlled trial. JAMA. Mar 3 2004;291(9):1081–91. doi: 10.1001/jama.291.9.1081
  • 46. Chen J, Zhou X, Li F, Spiegelman D. swdpwr: A SAS macro and an R package for power calculations in stepped wedge cluster randomized trials. Comput Methods Programs Biomed. Jan 2022;213:106522. doi: 10.1016/j.cmpb.2021.106522
  • 47. Hooper R, Copas A. Stepped wedge trials with continuous recruitment require new ways of thinking. Journal of clinical epidemiology. 2019;116:161–166. doi: 10.1016/j.jclinepi.2019.05.037
  • 48. Kasza J, Hooper R, Copas A, Forbes AB. Sample size and power calculations for open cohort longitudinal cluster randomized trials. Statistics in Medicine. 2020;39(13):1871–1883. doi: 10.1002/sim.8519
  • 49. Girling AJ, Hemming K. Statistical efficiency and optimal design for stepped cluster studies under linear mixed effects models. Stat Med. Jun 15 2016;35(13):2149–66. doi: 10.1002/sim.6850
  • 50. Zhan Z, de Bock GH, van den Heuvel ER. Statistical methods for unidirectional switch designs: Past, present, and future. Statistical Methods in Medical Research. 2018;27(9):2872–2882. doi: 10.1177/0962280216689280
  • 51. Zhang Y, Preisser JS, Li F, Turner EL, Toles M, Rathouz PJ. GEEMAEE: A SAS macro for the analysis of correlated outcomes based on GEE and finite-sample adjustments with application to cluster randomized trials. Comput Methods Programs Biomed. Mar 2023;230:107362. doi: 10.1016/j.cmpb.2023.107362
  • 52. Ouyang Y, Hemming K, Li F, Taljaard M. Estimating intra-cluster correlation coefficients for planning longitudinal cluster randomized trials: a tutorial. Int J Epidemiol. Oct 5 2023;52(5):1634–1647. doi: 10.1093/ije/dyad062
  • 53. Korevaar E, Kasza J, Taljaard M, et al. Intra-cluster correlations from the CLustered OUtcome Dataset bank to inform the design of longitudinal cluster trials. Clinical trials (London, England). Oct 2021;18(5):529–540. doi: 10.1177/17407745211020852
  • 54. Maleyeff L, Li F, Haneuse S, Wang R. Assessing exposure-time treatment effect heterogeneity in stepped-wedge cluster randomized trials. Biometrics. Sep 2023;79(3):2551–2564. doi: 10.1111/biom.13803
  • 55. Kenny A, Voldal EC, Xia F, Heagerty PJ, Hughes JP. Analysis of stepped wedge cluster randomized trials in the presence of a time-varying treatment effect. Stat Med. Sep 30 2022;41(22):4311–4339. doi: 10.1002/sim.9511
  • 56. Wang B, Wang X, Li F. How to achieve model-robust inference in stepped wedge trials with model-based methods? Biometrics. Oct 3 2024;80(4). doi: 10.1093/biomtc/ujae123
  • 57. Lawrie J, Carlin JB, Forbes AB. Optimal stepped wedge designs. Statistics & Probability Letters. 2015;99:210–214. doi: 10.1016/j.spl.2015.01.024
  • 58. Li F, Turner EL, Preisser JS. Optimal allocation of clusters in cohort stepped wedge designs. Statistics & Probability Letters. 2018;137:257–263. doi: 10.1016/j.spl.2018.02.002
  • 59. Moerbeek M. Optimal allocation of clusters in stepped wedge designs with a decaying correlation structure. PloS one. 2023;18(8):e0289275. doi: 10.1371/journal.pone.0289275
  • 60. Watson SI, Girling A, Hemming K. Optimal study designs for cluster randomised trials: An overview of methods and results. Stat Methods Med Res. Nov 2023;32(11):2135–2157. doi: 10.1177/09622802231202379
  • 61. Liu J, Colditz GA. Relative efficiency of unequal versus equal cluster sizes in cluster randomized trials using generalized estimating equation models. Biometrical journal Biometrische Zeitschrift. May 2018;60(3):616–638. doi: 10.1002/bimj.201600262
  • 62. Liu J, Colditz GA. Sample size calculation in three-level cluster randomized trials using generalized estimating equation models. Stat Med. Oct 30 2020;39(24):3347–3372. doi: 10.1002/sim.8670
  • 63. Liu J, Xiong C, Liu L, et al. Relative efficiency of equal versus unequal cluster sizes in cluster randomized trials with a small number of clusters. Journal of biopharmaceutical statistics. Mar 2021;31(2):191–206. doi: 10.1080/10543406.2020.1814795

Associated Data

Supplementary Materials

Supplementary Figures-1
Supplementary Figures-2
