Selecting the optimal longitudinal cluster randomized design with a continuous outcome: parallel-arm, crossover or stepped-wedge

Jingxia Liu; Fan Li; Siobhan Sutcliffe; Graham A Colditz

doi:10.1177/09622802251360409

Author manuscript; available in PMC: 2026 Jun 11.

Published in final edited form as: Stat Methods Med Res. 2025 Aug 11;34(10):2069–2090. doi: 10.1177/09622802251360409

PMCID: PMC13251621 NIHMSID: NIHMS2171977 PMID: 40785501

Abstract

The optimal designs (ODs) for parallel-arm longitudinal cluster randomized trials (PA-LCRTs), multiple-period cluster randomized crossover (CRXO) trials and stepped wedge cluster randomized trials (SW-CRTs), including closed-cohort and repeated cross-sectional designs, have been studied separately under a cost-efficiency framework based on generalized estimating equations (GEEs). However, whether a global OD exists across longitudinal designs and randomization schedules remains unknown. Therefore, this research addresses a critical gap by comparing OD features across complete LCRT designs with two treatment conditions and continuous outcomes. We define the OD as the design with either the lowest cost to obtain a desired level of power or the largest power given a fixed budget. For each of these ODs, we obtain the optimal number of clusters and the optimal cluster-period size (number of participants per cluster per period). To ensure equitable comparisons, we consider the GEE treatment effect estimator with the same block exchangeable correlation structure and develop OD algorithms with the lowest cost for each of six study designs. To obtain the OD with the largest power, we summarize previous OD algorithms and formulae and propose new ones. We suggest using the number of treatment sequences $L = T - 1$, where $T$ is the number of time-periods, in both the optimal closed-cohort and repeated cross-sectional SW-CRTs to have the lowest cost. This is consistent with our previous findings for ODs with the largest power in SW-CRTs. Comparing all six ODs, we conclude that optimal closed-cohort CRXO trials are global ODs, yielding both the lowest cost and largest power.

Keywords: Block exchangeable correlation structure, cluster randomized crossover (CRXO), parallel-arm longitudinal cluster randomized trial (PA-LCRT), optimal design (OD) under a budgetary constraint, stepped wedge cluster randomized trials (SW-CRTs)

1. Introduction

Cluster randomized trials (CRTs) are commonly used designs in implementation science and pragmatic clinical and educational research.^1,2 These designs, which randomize participants at the cluster rather than the individual level, are often performed when randomization at the participant level would be infeasible and/or would lead to contamination of the intervention and biased estimation of the intervention effect.³ Therefore, CRTs are increasingly used to evaluate the real-world impact of health and educational interventions at higher institutional levels, such as hospitals, medical practices, and schools.

1.1. Overview of longitudinal CRT designs

Three main longitudinal CRTs (LCRTs) will be considered in this article: parallel-arm longitudinal cluster randomized trials (PA-LCRTs), cluster randomized crossover (CRXO) trials, and stepped wedge cluster randomized trials (SW-CRTs). Each of these designs randomizes participants at the cluster level (e.g. medical center, clinic, ward, or classroom). However, they vary in terms of which clusters receive the intervention and when during the trial (i.e. time-period) they receive this intervention. Table 1 provides example schematics for three possible longitudinal cluster randomized designs with 20 clusters.

Table 1.

Example schematics for three possible longitudinal cluster randomized designs with $m = 20$ clusters

(a) An example of a PA-LCRT intervention schedule
Cluster	Treatment sequence	Time Periods
Cluster	Treatment sequence	Period 1	Period 2	Period 3	Period 4	Period 5	Period 6
1,⋯,10	1	Intervention	Intervention	Intervention	Intervention	Intervention	Intervention
11,⋯,20	2	Control	Control	Control	Control	Control	Control
(b) An example of a CRXO trial intervention schedule
Cluster	Treatment sequence	Time Periods
Cluster	Treatment sequence	Period 1	Period 2	Period 3	Period 4	Period 5	Period 6
1,⋯,10	1	Intervention	Control	Intervention	Control	Intervention	Control
11,⋯,20	2	Control	Intervention	Control	Intervention	Control	Intervention
(c) An example of a SW-CRT intervention schedule
Cluster	Treatment sequence	Time Periods
Cluster	Treatment sequence	Period 1 (Baseline)	Period 2 (Step 1)	Period 3 (Step 2)	Period 4 (Step 3)	Period 5 (Step 4)	Period 6 (Extended period)
1,⋯,5	1	Control	Intervention	Intervention	Intervention	Intervention	Intervention
6,⋯,10	2	Control	Control	Intervention	Intervention	Intervention	Intervention
11, ⋯,15	3	Control	Control	Control	Intervention	Intervention	Intervention
16,⋯,20	4	Control	Control	Control	Control	Intervention	Intervention

The first LCRT that we consider, PA-LCRTs, randomizes clusters to one of two treatment conditions. All participants within a given cluster receive the same treatment assignment and participants remain in their assigned treatment condition for the duration of the trial. In a typical PA-LCRT, participants within a cluster are often measured at multiple time points. For example, Faggiano et al. investigated the effect of a school-based substance abuse prevention program in which schools were randomly assigned to one of four experimental groups.⁴ The behavioral endpoints from each student were collected repeatedly at the baseline, 6-month and 18-month assessments. In this type of design, measurements across different time points are correlated within a participant (if a closed-cohort is considered), and measurements across participants at the same time point are correlated within a cluster.

Similar to PA-LCRTs, cluster randomized crossover (CRXO) trials randomize clusters, but unlike PA-LCRTs, to a sequence of treatment conditions rather than to one treatment condition. Within each cluster, participants receive the same treatment condition in each time-period. For example, for two-period CRXO trials, all enrolled clusters are randomized to the assignments of either IC (intervention in the first time-period, followed by control in the second time-period) or CI (control in the first time-period, followed by intervention in the second time-period). CRXO trials may require a washout period between two consecutive time-periods to minimize the carryover effect. For example, Jeyaratnam et al. conducted a two-period CRXO trial to determine whether a rapid screening test leads to a reduction in methicillin resistant Staphylococcus aureus (MRSA) acquisition on hospital general wards.⁵ They randomized 10 wards to receive either rapid screening for MRSA or conventional culture screening. This study included a three-month baseline period, a five-month first intervention period, a one-month washout period, and a five-month second intervention period. By leveraging both within- and between-cluster comparisons, CRXO trials are resource efficient and could be statistically more powerful than other designs.

The third major LCRT that we consider, stepped wedge cluster randomized trials (SW-CRTs), is similar to CRXO trials, except that it staggers treatment assignments over time. All participants within a cluster receive the same treatment in each time-period, but not all clusters experience each treatment for the same amount of time. In a typical SW-CRT, 1) all clusters start from the control condition at time-period 1 (the baseline period); 2) at each subsequent time-period, a subset of clusters is randomized to receive the intervention condition and to maintain the intervention status until the end of the study; and 3) at the end of the study, all clusters receive the intervention. SW-CRTs are particularly attractive to stakeholders when they perceive the intervention to be beneficial and when implementation of the intervention is logistically more feasible for a smaller fraction of enrolled clusters at an earlier time-period.

For all three types of LCRTs, we will consider two different sampling designs in our development: closed-cohort and repeated cross-sectional designs. In closed-cohort designs, the same participants are followed across time-periods and the outcome is measured in each time-period, whereas in repeated cross-sectional designs, different participants are enrolled in each time-period, and the outcome is collected within each of these time-periods.^6,7

1.2. Variance and sample size considerations for LCRTs

Given their clustered designs, investigators must consider the following three correlations in sample size calculations and statistical analyses of LCRTs: the “within-period,” “inter-period,” and “within-participant” correlations. The “within-period correlation” measures the similarity in outcomes between two participants in the same cluster and time-period. The “inter-period correlation” measures the similarity in outcomes between two participants in the same cluster but from different time-periods. The “within-participant correlation” measures the similarity in outcomes from the same participant across time-periods and is typically only needed under a closed-cohort design.

Taking these correlations into account, previous studies have derived variance formulae for: 1) PA-LCRTs, using generalized estimating equations (GEEs) with at least two correlation parameters;^8,9 and 2) $T$-period closed-cohort and repeated cross-sectional CRXO trials, assuming that $T$ is a multiple of 2 and that treatment assignments repeat the same assignment pattern after the first two periods (e.g., intervention, control, intervention, control, and so on).^10–14 Investigators have also derived sample size formulae for closed-cohort and repeated cross-sectional SW-CRTs with two treatment conditions.^15–25

In addition to variance and sample size formulae, researchers have proposed optimal sample sizes for CRTs, including the optimal number of clusters and cluster-period size (i.e. number of participants per cluster per period) as a function of cost and intracluster correlation coefficients for a given budget.^26–32 When correlation parameters are known, optimization provides the sample size that generates the minimum variance of the estimated treatment effect (ETE). The optimal designs (ODs) for PA-LCRTs, multiple-period CRXO trials, and SW-CRTs, including closed-cohort and repeated cross-sectional designs, have also been studied separately using GEEs under a cost-efficiency framework.^14,33 Specifically for multiple-period CRXO trials and assuming linear mixed models, Grantham et al. studied the optimal number of crossovers for maximizing efficiency and cost-efficiency,³⁴ whereas Moerbeek et al. studied the optimal number of time-periods when a treatment switches at the end of each time-period, and the optimal number of treatment switches with a fixed number of time-periods.³⁵ However, whether global ODs exist across study designs and randomization schedules remains unknown. Therefore, this research addresses a critical gap by comparing OD features across each of the following six designs: PA-LCRTs, multiple-period CRXO trials, and SW-CRTs, each with two different sampling designs (repeated cross-sectional or closed-cohort).
Importantly, our work also differs from previous investigations comparing different LCRTs that assumed a fixed sample size without optimization and budget restrictions.^15,21 For example, Hemming and Taljaard found that, given the same total sample size, SW-CRTs can be either more or less efficient than PA-LCRTs, depending on the value of the assumed correlation parameter.¹⁵ Such observations may or may not be generalizable to the comparison between optimal SW-CRT and optimal PA-LCRT designs, because the OD itself is a nonlinear function of the study budget and correlation parameters.

To ensure equitable comparisons across designs, we consider the same statistical model for all three LCRTs: a marginal model with a constant intervention effect and categorical period effects, estimated by GEEs, in which the treatment effect parameter carries a population-averaged interpretation.³⁶ For PA-LCRTs, the treatment assignment for clusters is either I (intervention) or C (control), and $π$ is the proportion of clusters receiving the I assignment. For CRXO trials, the treatment sequence assignment has repeating patterns of either IC or CI (e.g., ICICIC or CICICI for $T = 6$) and $π$ is the proportion of clusters receiving the IC treatment sequence. Finally, for SW-CRTs, the treatment assignment is also described by $L$, the number of treatment sequences. For example, for $T = 6$, $L$ could be 3, 4, or 5. When $L = 3$, clusters are allocated to one of the following treatment sequences: CIIIII, CCIIII, and CCCIII; when $L = 4$, possible sequences are: CIIIII, CCIIII, CCCIII, and CCCCII; and when $L = 5$, sequences are: CIIIII, CCIIII, CCCIII, CCCCII, and CCCCCI. In other words, SW-CRTs with $L = 3$ or $L = 4$ include additional periods in which all clusters are in the intervention condition (sometimes referred to as a maintenance phase).³⁷ Thus, for the purpose of our evaluation, we first consider how to choose the number of treatment sequences $L$ in SW-CRTs. In addition, this research focuses on optimizing the sample size configuration for balanced assignment to each sequence, allowing for a maintenance phase of the trial. Then, using this suggested $L$, we compare different LCRT designs to address the question “which design requires the lowest cost to obtain a desired level of power or has the largest power under a fixed budget”. Addressing this research question can help investigators decide the merit of each design option and potentially maximize cost efficiency at the study planning stage.

This article is organized as follows. In Section 2, we provide an overview of ETE variance formulae from GEE analyses for PA-LCRTs, multiple-period CRXO trials, and SW-CRTs with two treatment conditions and continuous outcomes. Section 3 proposes the OD with the lowest cost at a desired level of power for each study design and then compares ODs across six study designs. These include closed-cohort and repeated cross-sectional PA-LCRTs, multiple-period CRXO trials, and SW-CRTs. In Section 4, we illustrate our proposed ODs through a real example. Finally, in Section 5 we discuss our findings and offer suggestions for future research. Data sharing is not applicable to this article as no new data are created or analyzed in this study.

2. Statistical models and GEEs in LCRTs: A review

Let $Y_{i j t}$ be a continuous outcome from participant $j = 1, \dots, n$ in cluster $i = 1, \dots, m$ and time-period $t = 1, \dots, T$. The marginal model is $μ_{i j t} = δ_{t} + X_{i t} β$, where $μ_{i j t}$ is a marginal mean, $δ_{t}$ is the $t$th time-period effect, $X_{i t}$ is the treatment indicator of cluster $i$ in time-period $t$ (=1 if receiving intervention, 0 otherwise), and $β$ is the treatment effect of interest. Note that $β$ is assumed to be constant over time-periods; $δ_{t} + β$ is the marginal mean under intervention, whereas $δ_{t}$ is the marginal mean under control, in the $t$th time-period. The number of participants (or observations) per cluster per period, $n$, is generally referred to as the cluster-period size. Here we assume that every participant contributes complete measurements (i.e., no missing outcomes). For a continuous outcome with mean 0 and variance $σ^{2}$, the hypotheses of interest are $H_{0} : β = 0$ versus $H_{1} : β = β^{*}$ (an assumed value for the effect size).

To estimate the ETE variance, Preisser et al. suggested using the following block exchangeable correlation structure for LCRTs:³⁸

a constant within-period correlation (between outcomes from different participants within the same cluster $i$ during the same time-period $t$), $\mathrm{Corr}(Y_{i j_{1} t}, Y_{i j_{2} t}) = α_{0}$ for $j_{1} \neq j_{2}$;
a constant inter-period correlation (outcomes from different participants within the same cluster $i$ but across different time-periods), $\mathrm{Corr}(Y_{i j_{1} t_{1}}, Y_{i j_{2} t_{2}}) = α_{1}$ for $j_{1} \neq j_{2}$ and any $t_{1} \neq t_{2}$;
a constant within-participant correlation (outcomes from the same participant $j$ across different time-periods), $\mathrm{Corr}(Y_{i j t_{1}}, Y_{i j t_{2}}) = α_{2}$ for $t_{1} \neq t_{2}$.

Li et al. noted that the ETE variance is well-defined under the eigenvalue constraint $\min \{λ_{1}, λ_{2}, λ_{3}, λ_{4}\} > 0$, where the distinct eigenvalues of the block exchangeable correlation structure are $λ_{1} = 1 - α_{0} + α_{1} - α_{2}$, $λ_{2} = 1 - α_{0} - (T - 1) (α_{1} - α_{2})$, $λ_{3} = 1 + (n - 1) (α_{0} - α_{1}) - α_{2}$, and $λ_{4} = 1 + (n - 1) α_{0} + (T - 1) (n - 1) α_{1} + (T - 1) α_{2}$.¹⁹ When $α_{1} < \min (α_{0}, α_{2})$, we have $λ_{2} > λ_{1}$ and $λ_{3} > λ_{1}$, because $λ_{2} - λ_{1} = T (α_{2} - α_{1})$ and $λ_{3} - λ_{1} = n (α_{0} - α_{1})$. As such, if $λ_{1} > 0$, then the constraint $\min \{λ_{1}, λ_{2}, λ_{3}, λ_{4}\} > 0$ is satisfied, given that $λ_{4} > λ_{3}$. Below, we present an overview of the ETE variances from GEE analyses for individual LCRT designs.
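The eigenvalue constraint can be checked numerically before any variance is computed. The following is a minimal sketch, not the authors' code; the function names `block_exchangeable_eigenvalues` and `is_valid_structure` are hypothetical, and the example correlation values are taken from the first row of Table 4 purely for illustration.

```python
def block_exchangeable_eigenvalues(n, T, a0, a1, a2):
    """Return (lambda1, lambda2, lambda3, lambda4) for cluster-period size n,
    T time-periods, and correlations (a0, a1, a2) = (alpha_0, alpha_1, alpha_2)."""
    lam1 = 1 - a0 + a1 - a2
    lam2 = 1 - a0 - (T - 1) * (a1 - a2)
    lam3 = 1 + (n - 1) * (a0 - a1) - a2
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    return lam1, lam2, lam3, lam4


def is_valid_structure(n, T, a0, a1, a2):
    """True when the smallest eigenvalue is strictly positive."""
    return min(block_exchangeable_eigenvalues(n, T, a0, a1, a2)) > 0


# Illustrative values only: n = 13, T = 4, (alpha_0, alpha_1, alpha_2) = (0.05, 0.02, 0.2)
lams = block_exchangeable_eigenvalues(n=13, T=4, a0=0.05, a1=0.02, a2=0.2)
print(lams, is_valid_structure(13, 4, 0.05, 0.02, 0.2))
```

In this example $α_{1} < \min (α_{0}, α_{2})$, so $λ_{1}$ is the smallest of the four eigenvalues, in line with the argument above.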

2.1. Closed-cohort PA-LCRTs

Let $π$, the proportion of clusters receiving the intervention assignment, be a pre-determined value, e.g., 50%. The numbers of clusters in the intervention and control conditions are $m_{\mathrm{intv}} = m π$ and $m_{\mathrm{cont}} = m (1 - π)$, respectively. Table 1(a) illustrates a PA-LCRT design with $m = 20$, $T = 6$ and $π = 50\%$. To estimate the ETE variance for closed-cohort PA-LCRTs (as well as multiple-period CRXO trials; see below), Wang et al. assumed a more general correlation structure than Preisser et al. as follows:⁹

a constant within-period correlation, $\mathrm{Corr}(Y_{i j_{1} t}, Y_{i j_{2} t}) = α_{0}$ for $j_{1} \neq j_{2}$;
an inter-period correlation matrix, denoted by
$ϕ = \begin{pmatrix} ϕ_{11} & \dots & ϕ_{1 T} \\ \vdots & \ddots & \vdots \\ ϕ_{T 1} & \dots & ϕ_{T T} \end{pmatrix},$
where $ϕ_{t t} = α_{0}$ and $ϕ_{t_{1} t_{2}} = \mathrm{Corr}(Y_{i j_{1} t_{1}}, Y_{i j_{2} t_{2}})$ for $j_{1} \neq j_{2}$ and any $t_{1} \neq t_{2}$; and
a within-participant correlation matrix, denoted by
$Ω = \begin{pmatrix} 1 & \dots & ω_{1 T} \\ \vdots & \ddots & \vdots \\ ω_{T 1} & \dots & 1 \end{pmatrix},$
where $ω_{t_{1} t_{2}} = \mathrm{Corr}(Y_{i j t_{1}}, Y_{i j t_{2}})$ for $t_{1} \neq t_{2}$.

Using these assumptions and notation, the correlation structure is fully specified by $ϕ$ and $Ω$ . The correlation matrix for the vector of outcomes within each cluster is given by

R_{i} = I_{n \times n} \otimes (Ω - ϕ) + 1_{n \times n} \otimes ϕ,

where ‘ $\otimes$ ’ denotes the Kronecker product. For closed-cohort PA-LCRTs, Wang et al. derived the ETE variance as⁹

\mathrm{Var}(\hat{β}) = \frac{σ^{2} \sum_{t_{1} = 1}^{T} \sum_{t_{2} = 1}^{T} (ω_{t_{1} t_{2}} + (n - 1) ϕ_{t_{1} t_{2}})}{T^{2} n m π (1 - π)} .

When both $ϕ$ and $Ω$ have a compound symmetry structure (for example, $ϕ_{t_{1} t_{2}} = α_{1}$ and $ω_{t_{1} t_{2}} = α_{2}$ for $t_{1} \neq t_{2}$), this leads to a block exchangeable correlation structure and the above variance of the ETE simplifies to

\mathrm{Var}(\hat{β}) = \frac{σ^{2} λ_{4}}{T m n π (1 - π)}, \qquad (1.1)

where $λ_{4}$ is the leading eigenvalue of the block exchangeable correlation structure. We note that Equation (1.1) is identical to the variance in the sample size formula (Eq. (27)) in Liu et al.³⁹ and is a special case of the sample size formula (Eq. in Section 3.3.3) in Wang et al.⁴⁰ Teerenstra et al. have also proposed a nested exchangeable correlation structure with two constant correlations, $α_{1}$ and $α_{2}$,⁸ for a PA-LCRT; the block exchangeable correlation structure reduces to this nested structure when $α_{0} = α_{1}$.
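The simplification from the general variance to Equation (1.1) can be checked by hand. The following is a short worked sketch, using only the definitions above ($ω_{t t} = 1$, $ϕ_{t t} = α_{0}$, and compound symmetry off the diagonal): the double sum has $T$ diagonal terms and $T (T - 1)$ off-diagonal terms.

```latex
\sum_{t_1=1}^{T}\sum_{t_2=1}^{T}\bigl(\omega_{t_1 t_2} + (n-1)\,\phi_{t_1 t_2}\bigr)
  = T\bigl[1 + (n-1)\alpha_0\bigr] + T(T-1)\bigl[\alpha_2 + (n-1)\alpha_1\bigr]
  = T\lambda_4 ,
\qquad\text{so}\qquad
\mathrm{Var}(\hat{\beta})
  = \frac{\sigma^{2}\, T\lambda_4}{T^{2}\, n\, m\, \pi(1-\pi)}
  = \frac{\sigma^{2}\lambda_4}{T\, m\, n\, \pi(1-\pi)} .
```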

2.2. Closed-cohort multiple-period CRXO trials

Table 1(b) demonstrates a CRXO trial design with $m = 20$, $T = 6$ and $π = 50\%$. We assume either that there is no carryover effect or that a washout period is used to minimize it. Using a block exchangeable correlation structure, Liu et al. obtained the ETE variance for a $T$-period CRXO trial as

\mathrm{Var}(\hat{β}) = \frac{σ^{2} λ_{3}}{T m n π (1 - π)} . \qquad (2.1)

The proof was provided by Liu et al. in their appendix.¹⁴ When both $ϕ$ and $Ω$ from Section 2.1 have a compound symmetry structure, the ETE variance from Wang et al.⁹ is the same as Equation (2.1) for an even value of $T$.
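Because Equations (1.1) and (2.1) differ only in the eigenvalue in the numerator, the two variances can be compared directly for the same $m$, $n$, $T$ and $π$. The following is a brief worked sketch; the sign conclusion assumes $α_{1} \geq 0$ and $α_{2} \geq 0$ with at least one of them positive, which is an added assumption for this illustration.

```latex
\frac{\mathrm{Var}_{\mathrm{CRXO}}(\hat{\beta})}{\mathrm{Var}_{\mathrm{PA}}(\hat{\beta})}
  = \frac{\lambda_3}{\lambda_4},
\qquad
\lambda_4 - \lambda_3
  = T\bigl[(n-1)\alpha_1 + \alpha_2\bigr] > 0
\;\Longrightarrow\;
\mathrm{Var}_{\mathrm{CRXO}}(\hat{\beta}) < \mathrm{Var}_{\mathrm{PA}}(\hat{\beta}) .
```

This is a fixed-sample-size comparison only; it does not by itself determine which optimal design is cheaper, since the ODs also depend on the cost parameters.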

2.3. Closed-cohort SW-CRTs

To estimate the ETE variance for SW-CRTs, an additional parameter is required: the number of steps. A step is defined as the time-period when at least one cluster crosses over from control to intervention. The total number of steps is denoted by the number of treatment sequences $L$, and each cluster $i$ is allocated to a specific treatment sequence $l = 1, \dots, L$. Of note, the number of treatment sequences cannot be equal to or larger than the number of time-periods $(2 \leq L \leq T - 1)$; otherwise, some clusters will not receive the intervention. The number of clusters that cross over at each step is summarized by $h_{l}$ such that $\sum_{l = 1}^{L} h_{l} = m$. Table 1(c) demonstrates a study design with $m = 20$, $T = 6$, $L = 4$, and $h_{1} = h_{2} = h_{3} = h_{4} = 5$.

Under the following assumptions:

an equal number of clusters crossing over to intervention at each step, $h_{l} \equiv h$; and
$T \geq L + 1$ ,

Liu et al.¹⁴ derived the approximate variance of ETE for closed-cohort SW-CRTs as

\mathrm{Var}(\hat{β}) \approx \frac{4 σ^{2}}{m n} \times \frac{3}{2 (L - \frac{1}{L})} \{\frac{T λ_{3} λ_{4}}{\frac{L λ_{3}}{2} + (T - \frac{L}{2}) λ_{4}}\} . \qquad (3.1)

Of note, as a function of $L$, the denominator $(L - \frac{1}{L}) [\frac{L λ_{3}}{2} + (T - \frac{L}{2}) λ_{4}]$ equals $\frac{λ_{3} - λ_{4}}{2} {(L - \frac{T}{1 - \frac{λ_{3}}{λ_{4}}})}^{2} - \frac{T λ_{4}}{L}$ up to an additive constant that does not depend on $L$, and it is not a monotonic function of $L$ given that $λ_{3} < λ_{4}$.

2.4. Repeated cross-sectional LCRTs

In repeated cross-sectional trials, different participants are enrolled in each time-period, so the within-participant correlation $α_{2}$ is no longer needed in the within-cluster correlation structure. Therefore, the ETE variances of repeated cross-sectional longitudinal trials can be obtained by setting $α_{2} = α_{1}$ in the ETE variance formulae for closed-cohort PA-LCRTs, CRXO trials and SW-CRTs.¹⁹ For ease of reference, Table 2 summarizes all ETE variances for PA-LCRTs, multiple-period CRXO trials, and SW-CRTs from Section 2.
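As a quick check of this substitution, setting $α_{2} = α_{1}$ in the closed-cohort eigenvalues gives the repeated cross-sectional eigenvalues, written here with a subscript $R$ as in Table 2:

```latex
\lambda_{1R} = 1 - \alpha_0 + \alpha_1 - \alpha_1 = 1 - \alpha_0, \\
\lambda_{3R} = 1 + (n-1)(\alpha_0 - \alpha_1) - \alpha_1 = 1 + (n-1)\alpha_0 - n\alpha_1, \\
\lambda_{4R} = 1 + (n-1)\alpha_0 + (T-1)(n-1)\alpha_1 + (T-1)\alpha_1
             = 1 + (n-1)\alpha_0 + (T-1)\,n\,\alpha_1 .
```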

Table 2.

ETE Variances from GEE analyses for PA-LCRTs, multiple-period CRXO trials, and SW-CRTs

Design	$\mathrm{Var}(\hat{β})$	Note
Closed-cohort PA-LCRTs	$\frac{σ^{2} λ_{4}}{T m n π (1 - π)}$ (1.1); Ref: Wang et al.	$λ_{4} = 1 + (n - 1) α_{0} + (T - 1) (n - 1) α_{1} + (T - 1) α_{2}$; $π$: the proportion of clusters receiving the assignment of intervention.
Repeated cross-sectional PA-LCRTs	$\frac{σ^{2} λ_{4 R}}{T m n π (1 - π)}$ (1.2)	$λ_{4 R} = 1 + (n - 1) α_{0} + (T - 1) n α_{1}$.
Closed-cohort multiple-period CRXO trials	$\frac{σ^{2} λ_{3}}{T m n π (1 - π)}$ (2.1); Ref: Liu et al.	$λ_{3} = 1 + (n - 1) (α_{0} - α_{1}) - α_{2}$; $π$: the proportion of clusters receiving the assignment of IC in the first two periods.
Repeated cross-sectional multiple-period CRXO trials	$\frac{σ^{2} λ_{3 R}}{T m n π (1 - π)}$ (2.2); Ref: Liu et al.	$λ_{3 R} = 1 + (n - 1) α_{0} - n α_{1}$.
Closed-cohort SW-CRT	$\frac{4 σ^{2}}{m n} \times \frac{3}{2 (L - \frac{1}{L})} \{\frac{T λ_{3} λ_{4}}{\frac{L λ_{3}}{2} + (T - \frac{L}{2}) λ_{4}}\}$ (3.1); Ref: Li et al.	$λ_{3} = 1 + (n - 1) (α_{0} - α_{1}) - α_{2}$; $λ_{4} = 1 + (n - 1) α_{0} + (T - 1) (n - 1) α_{1} + (T - 1) α_{2}$.
Repeated cross-sectional SW-CRT	$\frac{4 σ^{2}}{m n} \times \frac{3}{2 (L - \frac{1}{L})} \{\frac{T λ_{3 R} λ_{4 R}}{\frac{L λ_{3 R}}{2} + (T - \frac{L}{2}) λ_{4 R}}\}$ (3.2); Ref: Li et al.	$λ_{3 R} = 1 + (n - 1) α_{0} - n α_{1}$; $λ_{4 R} = 1 + (n - 1) α_{0} + (T - 1) n α_{1}$.

$\hat{β} :$ ETE; $σ^{2} :$ outcome variance; $α_{0} :$ within-period correlation; $α_{1} :$ inter-period correlation; $α_{2} :$ within-participant correlation; $T :$ number of time-periods; $L :$ number of treatment sequences; $m :$ total number of clusters; $n :$ cluster-period size (number of participants per cluster per period).

3. ODs with the lowest cost

To derive the ODs, we assume that the association parameters in the correlation structure are known or can be estimated from routinely collected data. Methods and examples for obtaining intraclass correlation coefficient (ICC) estimates in LCRTs have been discussed in detail elsewhere.¹⁴ Based on marginal models estimated by GEE, previous researchers have shown that $\frac{\hat{β}}{\sqrt{\mathrm{Var}(\hat{β})}}$ is approximately normally distributed in CRTs when the number of clusters is sufficiently large.^8,9,19 Note, however, that this normal approximation may be inaccurate when the number of clusters is small. Using the ETE variance, the required sample size can be calculated to meet a certain power requirement, e.g., 80%, at a specific type I error rate, e.g., 5%. We will assume a type I error of 5% in all remaining examples. If the cluster-period size $n$ is known, then the number of clusters $m$ can be calculated based on the ETE variance for a pre-determined number of time-periods $T$. Similarly, when the number of clusters $m$ is known, the cluster-period size $n$ can be calculated. Of note, Hemming et al. showed that the desired power may not be achieved in the latter scenario.⁴¹

Following the cost-efficiency framework in Liu and Li,¹⁴ we assume that the cost per cluster recruitment is $c$ currency units (e.g., US dollars), the cost per participant enrollment is $s$ currency units, and the cost per outcome measurement is $e$ currency units. Thus, the total cost is calculated as $\mathrm{TC} = m (c + s n + e T n)$ in closed-cohort trials and $\mathrm{TC} = m [c + (s + e) T n]$ in repeated cross-sectional trials.
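The two cost formulae differ only in whether the enrollment cost $s$ is paid once per participant (closed-cohort) or once per participant in every period (repeated cross-sectional). A minimal sketch follows; the function name `total_cost` and its `closed_cohort` flag are hypothetical, and the example inputs are the Table 4 designs with $c = 3000$, $s = 200$, $e = 50$.

```python
def total_cost(m, n, T, c, s, e, closed_cohort=True):
    """Total trial cost for m clusters, cluster-period size n and T periods.

    closed_cohort=True : TC = m * (c + s*n + e*T*n)
    closed_cohort=False: TC = m * (c + (s + e)*T*n)   (repeated cross-sectional)
    """
    if closed_cohort:
        return m * (c + s * n + e * T * n)
    return m * (c + (s + e) * T * n)


# Closed-cohort SW-CRT design from Table 4 (m = 51, n = 13, T = 4)
cc_cost = total_cost(m=51, n=13, T=4, c=3000, s=200, e=50)
# Repeated cross-sectional SW-CRT design from Table 4 (m = 84, n = 7, T = 4)
rcs_cost = total_cost(m=84, n=7, T=4, c=3000, s=200, e=50, closed_cohort=False)
print(cc_cost, rcs_cost)  # 418200 840000
```

These values match the 418.2k and 840.0k entries reported in Table 4.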

We refer to the feasible ranges of the cluster-period size, e.g., $(2, n_{\max})$, and of the number of clusters, e.g., $(2, m_{\max})$, as the design space. We define the OD as the design within the design space with:

the lowest cost in the design space that obtains a desired level of power, e.g., 80%; or
the lowest ETE variance, equivalent to the highest power, in the design space given a fixed budget.^26,42–44

In the next sub-sections, we present a general algorithm for identifying the OD with the lowest cost per trial design. We also determine the optimal number of treatment sequences $L$ in SW-CRTs, and then we use this value to identify the OD for all six study designs of interest.

3.1. OD algorithms with the lowest cost for LCRTs

First, the OD algorithm for closed-cohort PA-LCRTs is proposed as follows:

Step 1. Specify desired levels of type I error and power, and the design space $(2, n_{\max})$ and $(2, m_{\max})$.

Step 2. For each integer value of $n$ in the design space,

1) calculate the eigenvalues $λ_{1}, λ_{2}, λ_{3}, λ_{4}$ and check that $\min \{λ_{1}, λ_{2}, λ_{3}, λ_{4}\} > 0$, where $λ_{1} = 1 - α_{0} + α_{1} - α_{2}$, $λ_{2} = 1 - α_{0} - (T - 1) (α_{1} - α_{2})$, $λ_{3} = 1 + (n - 1) (α_{0} - α_{1}) - α_{2}$, and $λ_{4} = 1 + (n - 1) α_{0} + (T - 1) (n - 1) α_{1} + (T - 1) α_{2}$;
2) calculate the smallest $m$ to obtain the desired level of power at the given type I error using the variance (1.1) and check that $m \leq m_{\max}$;
3) if conditions 1) and 2) are satisfied, calculate the total cost as $m (c + s n + e T n)$.

Step 3. Select the design with the smallest total cost in the design space, $\{n_{\mathrm{OD}}, m_{\mathrm{OD}}\}$.
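Steps 1 to 3 amount to a grid search over $n$. The following is a minimal sketch for the closed-cohort PA-LCRT, not the authors' implementation. It assumes a two-sided z-test power rule, $\mathrm{Var}(\hat{β}) \leq \{β / (z_{1 - α/2} + z_{\mathrm{power}})\}^{2}$, which the text does not spell out, and it does not force $m π$ to be an integer. All function names and the example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def eigenvalues(n, T, a0, a1, a2):
    """Four distinct eigenvalues of the block exchangeable structure."""
    lam1 = 1 - a0 + a1 - a2
    lam2 = 1 - a0 - (T - 1) * (a1 - a2)
    lam3 = 1 + (n - 1) * (a0 - a1) - a2
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    return lam1, lam2, lam3, lam4


def var_pa(m, n, T, lam4, sigma2=1.0, pi=0.5):
    """ETE variance for a closed-cohort PA-LCRT, Equation (1.1)."""
    return sigma2 * lam4 / (T * m * n * pi * (1 - pi))


def od_lowest_cost_pa(beta, T, a0, a1, a2, c, s, e, n_max, m_max,
                      sigma2=1.0, pi=0.5, alpha=0.05, power=0.80):
    """Grid search over n; returns the cheapest feasible design or None."""
    # Step 1: largest variance that still gives the desired power
    # (assumed two-sided z-test rule, see the lead-in).
    nd = NormalDist()
    target = (beta / (nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power))) ** 2
    best = None
    for n in range(2, n_max + 1):                       # Step 2
        lams = eigenvalues(n, T, a0, a1, a2)
        if min(lams) <= 0:                              # 1) eigenvalue check
            continue
        # 2) smallest m with Var <= target; var_pa(1, ...) equals m * Var
        m = max(2, ceil(var_pa(1, n, T, lams[3], sigma2, pi) / target))
        if m > m_max:
            continue
        cost = m * (c + s * n + e * T * n)              # 3) total cost
        if best is None or cost < best["cost"]:         # Step 3
            best = {"cost": cost, "n": n, "m": m}
    return best


# Hypothetical inputs for illustration only
od = od_lowest_cost_pa(beta=0.2, T=4, a0=0.05, a1=0.02, a2=0.2,
                       c=3000, s=200, e=50, n_max=200, m_max=5000)
print(od)
```

Swapping `var_pa` for the CRXO or SW-CRT variance and, for repeated cross-sectional designs, the cost line for $m [c + (s + e) T n]$ gives the analogous searches for the other designs.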

Similarly, we propose OD algorithms with the lowest cost for multiple-period CRXO trials and SW-CRTs. Algorithms for the six LCRT designs are detailed in Table 3. Of note, for closed-cohort CRTs the cluster size is the same as the cluster-period size $n$ ; whereas for repeated cross-sectional CRTs, the actual cluster size over all periods is $n \times T$ , as different participants are included in each distinct time period. Therefore, for closed-cohort CRTs, the total sample size (distinct number of participants) can be considered as $N = m \times n$ ; whereas for repeated cross-sectional CRTs, it is $N = m \times n \times T$ .

Table 3.

Optimal designs with the lowest cost for PA-LCRTs, multiple-period CRXO trials and SW-CRTs

Design	Algorithm for integer estimates
Common, unless otherwise specified	Step 1. Specify desired levels of type I error and power, and the design space $(2, n_{\max})$ and $(2, m_{\max})$. Step 2. For each integer value of $n$ in the design space, 1) calculate eigenvalues and check that the minimum of the eigenvalues $> 0$; 2) calculate the smallest $m$ to obtain the desired level of power at the given type I error using the variance and check that $m \leq m_{\max}$; 3) if conditions 1) and 2) are satisfied, calculate the total cost. Step 3. Select the design with the smallest total cost in the design space.
Closed-cohort PA-LCRTs	1) eigenvalues $λ_{1} = 1 - α_{0} + α_{1} - α_{2}$ , $λ_{2} = 1 - α_{0} - (T - 1) (α_{1} - α_{2})$ , $λ_{3} = 1 + (n - 1) (α_{0} - α_{1}) - α_{2}$ , $λ_{4} = 1 + (n - 1) α_{0} + (T - 1) (n - 1) α_{1} + (T - 1) α_{2};$ 2) variance (1.1); 3) total cost $m (c + s n + e T n)$
Repeated cross-sectional PA-LCRTs	1) eigenvalues $λ_{1 R} = 1 - α_{0}$ , $λ_{3 R} = 1 + (n - 1) α_{0} - n α_{1}$ , $λ_{4 R} = 1 + (n - 1) α_{0} + (T - 1) n α_{1};$ 2) variance (1.2); 3) total cost $m [c + (s + e) T n]$
Closed-cohort multiple-period CRXO trials	1) eigenvalues $λ_{1} = 1 - α_{0} + α_{1} - α_{2}$ , $λ_{2} = 1 - α_{0} - (T - 1) (α_{1} - α_{2})$ , $λ_{3} = 1 + (n - 1) (α_{0} - α_{1}) - α_{2}$ , $λ_{4} = 1 + (n - 1) α_{0} + (T - 1) (n - 1) α_{1} + (T - 1) α_{2};$ 2) variance (2.1); 3) total cost $m (c + s n + e T n)$
Repeated cross-sectional multiple-period CRXO trials	1) eigenvalues $λ_{1 R} = 1 - α_{0}$ , $λ_{3 R} = 1 + (n - 1) α_{0} - n α_{1}$ , $λ_{4 R} = 1 + (n - 1) α_{0} + (T - 1) n α_{1};$ 2) variance (2.2); 3) total cost $m [c + (s + e) T n]$
Closed-cohort SW-CRT	Step 1. the design space $(2, n_{\max})$ and $(L, m_{\max})$. Step 2. 1) eigenvalues $λ_{1} = 1 - α_{0} + α_{1} - α_{2}$, $λ_{2} = 1 - α_{0} - (T - 1) (α_{1} - α_{2})$, $λ_{3} = 1 + (n - 1) (α_{0} - α_{1}) - α_{2}$, $λ_{4} = 1 + (n - 1) α_{0} + (T - 1) (n - 1) α_{1} + (T - 1) α_{2}$; 2) variance (3.1); 3) total cost $m (c + s n + e T n)$
Repeated cross-sectional SW-CRT	Step 1. the design space $(2, n_{\max})$ and $(L, m_{\max})$. Step 2. 1) eigenvalues $λ_{1 R} = 1 - α_{0}$, $λ_{3 R} = 1 + (n - 1) α_{0} - n α_{1}$, $λ_{4 R} = 1 + (n - 1) α_{0} + (T - 1) n α_{1}$; 2) variance (3.2); 3) total cost $m [c + (s + e) T n]$

$α_{0} :$ within-period correlation; $α_{1} :$ inter-period correlation; $α_{2} :$ within-participant correlation; $T :$ number of time-periods; $L :$ number of treatment sequences; $n :$ cluster-period size (number of participants per cluster per period); $n_{\max} :$ maximum cluster-period size; $m :$ total number of clusters; $m_{\max} :$ maximum number of clusters; $c :$ cost per cluster; $s :$ cost per participant; for closed-cohort LCRTs, $e :$ cost per time-period.

3.2. Choosing the number of treatment sequences in SW-CRTs

To ensure equitable comparisons across study designs, we next determine the optimal number of treatment sequences $L$ in SW-CRTs. For a pre-determined value of $T$, $L$ can in theory be any integer between 2 and $T - 1$ (allowing for a different number of clusters to be randomized to each unique sequence). We start by examining ODs for $T = 4$. Table 4 details the calculated total cost $\mathrm{TC}_{\mathrm{OD}}$, number of clusters $m_{\mathrm{OD}}$, cluster-period size $n_{\mathrm{OD}}$, and total sample size $N_{\mathrm{OD}}$ under the OD for known association parameters $(α_{0}, α_{1}, α_{2})$ in closed-cohort and repeated cross-sectional SW-CRTs, assuming a type I error of 5%, a power of 80%, a treatment effect of $β = 0.2$, $σ^{2} = 1$, $c = 3000$, $s = 200$, and $e = 50$. For example, for a closed-cohort SW-CRT with $(α_{0}, α_{1}, α_{2}) = (0.05, 0.020, 0.2)$ and $L = 2$, the OD has a total cost of 684.0k, a total number of clusters of 76, a cluster-period size of 15, and a total sample size of 1140, whereas for a similar trial with $L = 3$, these values are 418.2k, 51, 13, and 663, respectively. As a further example, for a repeated cross-sectional SW-CRT with $(α_{0}, α_{1}) = (0.05, 0.020)$ and $L = 2$, the OD has a total cost of 1408.0k, a total number of clusters of 128, a cluster-period size of 8, and a total sample size of 4096, whereas for a similar trial with $L = 3$, these values are 840.0k, 84, 7, and 2352, respectively.
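The worked examples above can be checked at their reported cluster-period sizes. The sketch below evaluates variance (3.1) at a given $n$, finds the smallest $m$ that reaches 80% power, and computes the cost. It is not the authors' code and rests on two assumptions not stated in the text: the two-sided z-test power rule $\mathrm{Var}(\hat{β}) \leq \{β / (z_{0.975} + z_{0.80})\}^{2}$, and rounding $m$ up to a multiple of $L$ so that each sequence receives the same number of clusters. The function names are hypothetical. It does not perform the full search over $n$.

```python
from math import ceil
from statistics import NormalDist


def sw_var_times_m(n, T, L, a0, a1, a2, sigma2=1.0):
    """m * Var(beta_hat) from Equation (3.1).

    For a repeated cross-sectional SW-CRT, pass a2 = a1 (Equation (3.2))."""
    lam3 = 1 + (n - 1) * (a0 - a1) - a2
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    ratio = T * lam3 * lam4 / (L * lam3 / 2 + (T - L / 2) * lam4)
    return 4 * sigma2 / n * 3 / (2 * (L - 1 / L)) * ratio


def sw_design_at_n(n, T, L, a0, a1, a2, beta, c, s, e,
                   closed_cohort=True, alpha=0.05, power=0.80):
    """Smallest feasible m (a multiple of L) and total cost at a given n."""
    nd = NormalDist()
    # Assumed power rule: largest variance that still gives the desired power
    target = (beta / (nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power))) ** 2
    m = ceil(sw_var_times_m(n, T, L, a0, a1, a2) / target)
    m = L * ceil(m / L)  # assumption: equal number of clusters per sequence
    if closed_cohort:
        per_cluster = c + s * n + e * T * n
    else:
        per_cluster = c + (s + e) * T * n
    return m, m * per_cluster


costs = dict(beta=0.2, c=3000, s=200, e=50)
cc_L3 = sw_design_at_n(13, T=4, L=3, a0=0.05, a1=0.02, a2=0.2, **costs)
cc_L2 = sw_design_at_n(15, T=4, L=2, a0=0.05, a1=0.02, a2=0.2, **costs)
rcs_L3 = sw_design_at_n(7, T=4, L=3, a0=0.05, a1=0.02, a2=0.02,
                        closed_cohort=False, **costs)
print(cc_L3, cc_L2, rcs_L3)
```

Under these assumptions the three calls give $(m, \mathrm{TC})$ of (51, 418200), (76, 684000) and (84, 840000), in agreement with the corresponding Table 4 entries.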

Table 4.

Optimal designs with the lowest cost to obtain 80% power for SW-CRTs with $T = 4$ for known association parameter $(α_{0}, α_{1}, α_{2})$

Association Parameter $(α_{0}, α_{1}, α_{2})$	Closed-cohort								Repeated cross-sectional
	$L = 2$				$L = 3$				$L = 2$				$L = 3$
	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$
(0.05, 0.020, 0.2)	684.0	76	15	1140	418.2	51	13	663	1408.0	128	8	4096	840.0	84	7	2352
(0.05, 0.020, 0.6)	471.2	76	8	608	304.2	39	12	468
(0.05, 0.020, 0.8)	322.4	52	8	416	217.8	33	9	297
(0.05, 0.040, 0.2)	536.0	40	26	1040	349.8	33	19	627	1254.0	66	16	4224	780.0	60	10	2400
(0.05, 0.040, 0.6)	330.0	30	20	600	216.0	24	15	360
(0.05, 0.040, 0.8)	207.2	28	11	308	140.4	18	12	216
(0.10, 0.020, 0.2)	950.4	144	9	1296	574.2	87	9	783	1744.0	218	5	4360	1008.0	126	5	2520
(0.10, 0.020, 0.6)	702.0	130	6	780	452.4	78	7	546
(0.10, 0.020, 0.8)	512.4	122	3	366	340.2	81	3	243
(0.10, 0.040, 0.2)	880.4	142	8	1136	546.0	78	10	780	1674.0	186	6	4464	999.0	111	6	2664
(0.10, 0.040, 0.6)	626.4	108	7	756	405.0	75	6	450
(0.10, 0.040, 0.8)	450.0	90	5	450	300.0	60	5	300

Treatment effect = 0.2, outcome variance = 1; $α_{0}$: within-period correlation; $α_{1}$: inter-period correlation; $α_{2}$: within-participant correlation; $T$: number of time-periods; $L$: number of treatment sequences; cost per cluster $c = 3000$; for closed-cohort LCRTs, cost per participant $s = 200$, cost per time-period $e = 50$, and total cost $= m (c + s n + e T n)$; for repeated cross-sectional LCRTs, cost per participant $s = 250$ and total cost $= m [c + (s + e) T n]$; costs are reported in thousands (k); $m$: total number of clusters; $m_{\max} = 5000$; $n$: cluster-period size (number of participants per cluster per period); $n_{\max} = 5000$; $N$: total sample size. Only $(α_{0}, α_{1})$ is needed for repeated cross-sectional trials.
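As a quick arithmetic check on the Cost column, the short sketch below recomputes the first row of Table 4 from $m$, $n$, and $T$. The function names are hypothetical. The closed-cohort formula is the one stated in the table note. For the repeated cross-sectional entries, the tabulated costs appear to be matched by the form given in Equation (5) of the Appendix, $m (c + s T n)$, with 250 treated as the combined cost per participant per period; that reading is an assumption of this sketch.

```python
def closed_cohort_cost(m, n, T, c=3000, s=200, e=50):
    """Total cost m(c + s*n + e*T*n): n participants per cluster are
    enrolled once (cost s each) and measured in each of T periods (cost e)."""
    return m * (c + s * n + e * T * n)


def cross_sectional_cost(m, n, T, c=3000, s_per_period=250):
    """Total cost m(c + s*T*n): n new participants per cluster are enrolled
    and measured in each of T periods (assumed combined unit cost of 250)."""
    return m * (c + s_per_period * T * n)


# First row of Table 4 (T = 4), reported in thousands.
print(closed_cohort_cost(76, 15, 4) / 1000)    # 684.0  (closed-cohort, L = 2)
print(closed_cohort_cost(51, 13, 4) / 1000)    # 418.2  (closed-cohort, L = 3)
print(cross_sectional_cost(128, 8, 4) / 1000)  # 1408.0 (cross-sectional, L = 2)
print(cross_sectional_cost(84, 7, 4) / 1000)   # 840.0  (cross-sectional, L = 3)
```

The total sample sizes follow the same pattern: $N = m n$ for closed-cohort designs (76 × 15 = 1140) and $N = m n T$ for repeated cross-sectional designs (128 × 8 × 4 = 4096).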

Figure 1 presents the required $\{TC_{OD}, m_{OD}, n_{OD}, N_{OD}\}$ for various closed-cohort SW-CRTs. Each plot includes multiple lines for different values of $L$ and correlation $(α_{0}, α_{1}, α_{2})$. Within each plot, we hold two of the three correlation values constant and vary the third. For example, $(α_{1}, α_{2}) = (0.020, 0.6)$ in Figure 1A–1D, $(α_{0}, α_{2}) = (0.05, 0.6)$ in Figure 1E–1H, and $(α_{0}, α_{1}) = (0.05, 0.020)$ in Figure 1I–1L. The total cost, optimal number of clusters, and total sample size decrease with increasing $L$; however, no clear pattern is observed for the optimal cluster-period size as $L$ varies. In addition, as $α_{0}$ increases, the total cost, optimal number of clusters, and total sample size generally increase, whereas the optimal cluster-period size generally decreases for specific values of $L$. In contrast, as $α_{1}$ increases, the total cost, optimal number of clusters, and total sample size generally decrease, whereas the optimal cluster-period size generally increases for specific values of $L$. As $α_{2}$ increases, the total cost, optimal number of clusters, optimal cluster-period size, and total sample size generally decrease for specific values of $L$. Figure 2 presents the same information for repeated cross-sectional SW-CRTs. These figures illustrate that the same conclusions hold for repeated cross-sectional SW-CRTs as for closed-cohort SW-CRTs.

Optimal designs with the lowest cost and 80% power to detect a treatment effect size of 0.2 and variance of $σ^{2} = 1$ for known association parameter $(α_{0}, α_{1}, α_{2})$ assuming cost per cluster $c = 3000$ , cost per participant $s = 200,$ and cost per time-period $e = 50$ in closed-cohort SW-CRTs with $T = 4$

Extending these analyses to additional values of $T$ (see Supplementary Figures 1–10), we see that the optimal number of clusters and total sample size decrease with increasing $L$ and reach a minimum at $L = T - 1$ for $T \leq 6$. When $T > 6$, these values are not at a minimum for $L = T - 1$, but they are often very close to the minimum. Therefore, we recommend using $L = T - 1$ in SW-CRTs to yield the lowest cost. We also use this value in the remaining subsections when we compare optimal sample size calculations across the six designs of interest.

3.3. Comparing ODs with the lowest cost across six LCRTs

In this subsection, we focus our analysis on $T \geq 4$ because: 1) SW-CRTs require that $2 \leq L \leq T - 1$, so $T$ must be at least 3; and 2) Equation (2.1) requires $T$ to be even, making 4 the minimum value of $T$ that can be used for all LCRTs. We also limit our investigation to $T \leq 12$ because $T$ is generally a relatively small number in practice. Specifically, in a review of 160 published SW-CRTs between 2016 and 2022, the interquartile range of the number of sequences is (4, 7), which is included in our choice of design parameters. This subsection assumes the same desired levels of type I error and power, treatment effect, and unit costs as in Section 3.2.

3.3.1. Comparing ODs at a fixed value of $T$

Table 5 details the required $\{TC_{OD}, m_{OD}, n_{OD}, N_{OD}\}$ for known association parameters $(α_{0}, α_{1}, α_{2})$ and all six designs at $T = 4$. For example, for $(α_{0}, α_{1}, α_{2}) = (0.05, 0.020, 0.2)$, the OD for a closed-cohort PA-LCRT has a total cost of 358.8k, a total number of clusters of 46, a cluster-period size of 12, and a total sample size of 552, whereas for a repeated cross-sectional PA-LCRT, these values are 480.0k, 60, 5, and 1200, respectively. For a closed-cohort CRXO trial, the OD has a total cost of 144.0k, a total number of clusters of 16, a cluster-period size of 15, and a total sample size of 240, whereas for a repeated cross-sectional CRXO trial, these values are 330.0k, 22, 12, and 1056, respectively. Finally, for a closed-cohort SW-CRT, the OD has a total cost of 418.2k, a total number of clusters of 51, a cluster-period size of 13, and a total sample size of 663, whereas for a repeated cross-sectional SW-CRT, these values are 840.0k, 84, 7, and 2352, respectively.

Table 5.

Optimal designs with the lowest cost to obtain 80% power for six designs with $T = 4$ for known association parameter $(α_{0}, α_{1}, α_{2})$

Association Parameter $(α_{0}, α_{1}, α_{2})$	PA-LCRTs								Multiple-period CRXO trials								SW-CRTs ( $L = 3$ )
	Closed-cohort				Repeated cross-sectional				Closed-cohort				Repeated cross-sectional				Closed-cohort				Repeated cross-sectional
	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$
(0.05, 0.020, 0.2)	358.8	46	12	552	480.0	60	5	1200	144.0	16	15	240	330.0	22	12	1056	418.2	51	13	663	840.0	84	7	2352
(0.05, 0.020, 0.6)	514.8	66	12	792					92.4	14	9	126					304.2	39	12	468
(0.05, 0.020, 0.8)	582.8	62	16	992					64.8	12	6	72					217.8	33	9	297
(0.05, 0.040, 0.2)	429.2	74	7	518	560.0	80	4	1280	107.2	8	26	208	264.0	12	19	912	349.8	33	19	627	780.0	60	10	2400
(0.05, 0.040, 0.6)	602.0	86	10	860					63.6	6	19	114					216.0	24	15	360
(0.05, 0.040, 0.8)	680.8	92	11	1012					42.0	6	10	60					140.4	18	12	216

Treatment effect = 0.2, outcome variance = 1; $α_{0}$: within-period correlation; $α_{1}$: inter-period correlation; $α_{2}$: within-participant correlation; $T$: number of time-periods; $L$: number of treatment sequences; cost per cluster $c = 3000$; for closed-cohort LCRTs, cost per participant $s = 200$, cost per time-period $e = 50$, and total cost $= m (c + s n + e T n)$; for repeated cross-sectional LCRTs, cost per participant $s = 250$ and total cost $= m [c + (s + e) T n]$; costs are reported in thousands (k); $m$: total number of clusters; $m_{\max} = 5000$; $n$: cluster-period size (number of participants per cluster per period); $n_{\max} = 5000$; $N$: total sample size. Only $(α_{0}, α_{1})$ is needed for repeated cross-sectional trials.
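To make the lowest-cost search concrete, the following is a minimal brute-force sketch for the closed-cohort PA-LCRT only. It is not the authors' algorithm: it assumes the variance form $σ^{2} λ_{4} / \{T n m π (1 - π)\}$ given in the Appendix, a normal-approximation (z-test) power calculation, equal allocation with an even number of clusters, and a reduced search grid (`n_max`, `m_max`). All function and argument names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pa_power(m, n, T, a0, a1, a2, beta=0.2, sigma2=1.0, pi=0.5, alpha=0.05):
    """Normal-approximation power for a closed-cohort PA-LCRT (assumed z-test)."""
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    var = sigma2 * lam4 / (T * n * m * pi * (1 - pi))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(beta) / sqrt(var) - z_crit)


def lowest_cost_design(T, a0, a1, a2, target=0.80, n_max=100, m_max=200,
                       c=3000, s=200, e=50):
    """Return (cost, m, n) of the cheapest integer design reaching the target power."""
    best = None
    for n in range(2, n_max + 1):
        # Assumption: m is even so clusters split equally between arms.
        for m in range(2, m_max + 1, 2):
            if pa_power(m, n, T, a0, a1, a2) >= target:
                cost = m * (c + s * n + e * T * n)
                if best is None or cost < best[0]:
                    best = (cost, m, n)
                break  # for fixed n, any larger m only adds cost
    return best


print(lowest_cost_design(4, 0.05, 0.020, 0.2))  # (358800, 46, 12)
print(lowest_cost_design(4, 0.05, 0.020, 0.6))  # (514800, 66, 12)
```

Under these assumptions the search returns the closed-cohort PA-LCRT entries of Table 5 for the first two rows, which suggests the tabulated designs are consistent with a z-test and an even number of clusters, although the sketch does not establish how the original algorithm handles either point.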

Figure 3 presents the ODs with the lowest cost under $T = 4$ for varying correlation values. The estimates, $\{TC_{OD}, m_{OD}, n_{OD}, N_{OD}\}$, from optimal repeated cross-sectional PA-LCRTs, CRXO trials and SW-CRTs are constant (Figure 3I–3L) because $α_{1}$ and $α_{2}$ are constant and equal to each other.

3.3.1.1. Comparing total cost under ODs at a fixed value of $T = 4$

As $α_{0}$ increases, we observe that the total cost increases for all six designs, whereas as $α_{1}$ or $α_{2}$ increases, the total cost increases for optimal PA-LCRTs only.

Comparing the ODs for different trial designs, the trial design with the lowest cost is an optimal closed-cohort CRXO trial, followed by an optimal repeated cross-sectional CRXO trial and an optimal closed-cohort SW-CRT. Optimal repeated cross-sectional SW-CRTs have a higher total cost than the other trial designs. Comparing the ODs for CRXO trials and SW-CRTs, optimal closed-cohort designs always require a lower cost than repeated cross-sectional trials. In contrast, optimal closed-cohort PA-LCRTs may require a higher cost than optimal repeated cross-sectional PA-LCRTs.

3.3.1.2. Comparing optimal number of clusters and cluster-period size at a fixed value of $T = 4$

In addition to the total cost, the number of clusters and the cluster-period size are also important considerations, with the number of clusters generally playing a more important role than cluster-period size. As $α_{0}$ increases, we observe that the optimal number of clusters generally increases, whereas as $α_{1}$ or $α_{2}$ increases, the number of clusters generally increases for optimal PA-LCRTs only. The optimal cluster-period size generally decreases as $α_{0}$ increases and increases as $α_{1}$ increases. As $α_{2}$ increases, the optimal cluster-period size increases for optimal closed-cohort PA-LCRTs only.

Comparing the OD for different trial designs, optimal closed-cohort CRXO trials require the smallest number of clusters, followed by optimal repeated cross-sectional CRXO trials and optimal closed-cohort SW-CRTs. Optimal repeated cross-sectional SW-CRTs sometimes return the largest number of clusters. Comparing the OD for CRXO trials and SW-CRTs, optimal closed-cohort designs require a much smaller number of clusters than repeated cross-sectional trials, given the same values of other design parameters.

3.3.1.3. Comparing the total sample size under ODs at a fixed value of $T = 4$

A final feasibility consideration is the total sample size. As $α_{0}$ increases, we observe that the total sample size is generally non-decreasing, whereas as $α_{2}$ increases, the total sample size generally increases for optimal PA-LCRTs. As $α_{1}$ increases, total sample size does not vary much for all six ODs.

Comparing the ODs for different trial designs, the trial design with the smallest total sample size is an optimal closed-cohort CRXO trial and the trial design with the largest total sample size is an optimal repeated cross-sectional SW-CRT. Optimal closed-cohort designs require a much smaller total sample size than repeated cross-sectional trials.

3.3.2. Comparing ODs at different values of $T$

Table 6 presents the required $\{TC_{OD}, m_{OD}, n_{OD}, N_{OD}\}$ for known association parameter $(α_{0} = 0.05, α_{1} = 0.020, α_{2} = 0.6)$ across all six designs at different values of $T$ and Figure 2A–2D visualize these results. As $T$ increases, the total cost for optimal closed-cohort PA-LCRTs generally increases. In contrast, the opposite pattern is observed for the other five ODs. Overall, optimal closed-cohort CRXO trials, optimal closed-cohort SW-CRTs, and optimal repeated cross-sectional CRXO trials have lower costs than the other ODs. Optimal closed-cohort CRXO trials require the smallest number of clusters, followed by optimal repeated cross-sectional CRXO trials and optimal closed-cohort SW-CRTs.

Table 6.

Optimal designs with the lowest cost to obtain 80% power for six designs with $α_{0} = 0.05$ , $α_{1} = 0.020$ and $α_{2} = 0.6$

$T$	PA-LCRTs								Multiple-period CRXO trials								SW-CRTs ( $L = T - 1$ )
	Closed-cohort				Repeated cross-sectional				Closed-cohort				Repeated cross-sectional				Closed-cohort				Repeated cross-sectional
	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$	Cost	$m$	$n$	$N$
4	514.8	66	12	792	480.0	60	5	1200	92.4	14	9	126	330.0	22	12	1056	304.2	39	12	468	840.0	84	7	2352
6	558.0	62	12	744	465.0	62	3	1116	70.0	10	8	80	297.0	18	9	972	210.0	30	8	240	675.0	45	8	2160
8	612.0	60	12	720	450.0	50	3	1200	61.2	6	12	72	286.0	22	5	880	176.4	21	9	189	616.0	56	4	1792
10	669.6	72	9	648	448.0	56	2	1120	51.6	6	8	48	276.0	12	8	960	154.8	18	8	144	585.0	45	4	1800
12	726.0	66	10	660	450.0	50	2	1200	46.8	6	6	36	270.0	10	8	960	147.4	11	13	143	594.0	33	5	1980

3.4. Summary

Considering all parameters of interest (TC, number of clusters, cluster-period size, and total sample size), our numerical studies find that optimal closed-cohort CRXO trials perform the best across the six ODs. However, depending on whether TC or other factors related to enrollment feasibility (e.g., number of clusters) are more important, we also recommend optimal repeated cross-sectional CRXO trials and closed-cohort SW-CRTs as strong alternatives. In Appendix 1, we summarize and propose the ODs with the highest power under a fixed budget for each of the six designs and then compare ODs across these designs. We reach the same conclusion: optimal closed-cohort CRXO trials perform the best, followed by optimal repeated cross-sectional CRXO trials. We also notice that the integer estimates from the OD algorithms are very close to the decimal estimates from the OD formulae.

4. An example

This section re-designs the CRTs for the Prevention of Suicide in Primary Care Elderly: Collaborative Trial (PROSPECT), where the unit of randomization is the primary care practice and all patients within a primary care practice receive the same assignment (either intervention or usual care).⁴⁵ The trial aims to determine the effect of a primary care intervention on suicidal ideation and depression in older patients and collects depression severity at baseline, 4 months, 8 months, and 12 months using the 24-item Hamilton Depression Rating Scale (HDRS). We consider $T = 4$ and use the same parameter settings as in Wang et al.,⁹ with a treatment effect of 1, a standard deviation of 6, and $(α_{0}, α_{1}, α_{2}) = (0.03, 0.015, 0.3)$, in addition to the assumed unit costs $c = 3000$, $s = 200$, and $e = 50$.

First, we identify an optimal design with the lowest cost to obtain at least 80% power at the two-sided significance level of 5%. The number of primary care practices is $m = 56$, the cluster-period size (number of patients per primary care practice) is $n = 15$, and the total cost is 504k for an optimal closed-cohort PA-LCRT; $m = 68$, $n = 6$, and the total cost is 612k for an optimal repeated cross-sectional PA-LCRT; $m = 14$, $n = 20$, and the total cost is 154k for an optimal closed-cohort CRXO trial; $m = 24$, $n = 14$, and the total cost is 408k for an optimal repeated cross-sectional CRXO trial; $m = 48$, $n = 17$, and the total cost is 470.4k for an optimal closed-cohort SW-CRT; and $m = 72$, $n = 12$, and the total cost is 1080k for an optimal repeated cross-sectional SW-CRT. Therefore, the optimal closed-cohort CRXO trial has the lowest cost, followed by the optimal repeated cross-sectional CRXO trial and the optimal closed-cohort SW-CRT.

Second, we consider $B = 408$k as a given total budget and aim to identify the optimal design with the highest power. An optimal closed-cohort PA-LCRT returns $m = 52$, $n = 12$ with 71.3% power and an actual cost of 405.6k; an optimal repeated cross-sectional PA-LCRT returns $m = 40$, $n = 7$ with 62.6% power and an actual cost of 400k; an optimal closed-cohort CRXO trial returns $m = 40$, $n = 18$ with 99.6% power and an actual cost of 408k; an optimal repeated cross-sectional CRXO trial returns $m = 24$, $n = 14$ with 80.3% power and an actual cost of 408k; an optimal closed-cohort SW-CRT returns $m = 45$, $n = 15$ with 74.0% power and an actual cost of 405k; and an optimal repeated cross-sectional SW-CRT returns $m = 27$, $n = 12$ with 40.7% power and an actual cost of 405k. The optimal closed-cohort CRXO trial is the best, with the highest power within the budget, followed by the optimal repeated cross-sectional CRXO trial and optimal closed-cohort SW-CRT.
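The power values quoted for the closed-cohort PA-LCRT can be approximately recovered from the variance expression in the Appendix. The sketch below assumes a two-sided z-test (normal approximation) and equal allocation ($π = 0.5$); the function name and the `prospect` dictionary are hypothetical conveniences, not part of the original analysis.

```python
from math import sqrt
from statistics import NormalDist


def pa_closed_cohort_power(m, n, T, a0, a1, a2, effect, sd, pi=0.5, alpha=0.05):
    """Two-sided power for a closed-cohort PA-LCRT under a normal approximation."""
    lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
    var = sd ** 2 * lam4 / (T * n * m * pi * (1 - pi))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The rejection probability in the opposite tail is negligible and ignored.
    return NormalDist().cdf(abs(effect) / sqrt(var) - z_crit)


# PROSPECT settings used in this section.
prospect = dict(T=4, a0=0.03, a1=0.015, a2=0.3, effect=1.0, sd=6.0)

# Lowest-cost design (m = 56, n = 15): should reach at least 80% power.
print(pa_closed_cohort_power(56, 15, **prospect))
# Design under the 408k budget (m = 52, n = 12): reported as 71.3% power.
print(pa_closed_cohort_power(52, 12, **prospect))
```

The same function applies to other closed-cohort PA-LCRT settings; the CRXO and SW-CRT designs require their own variance expressions (Equations (2.1) and (3.1)), which are not reproduced here.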

5. Discussion

In this paper, we discuss the OD for six possible LCRT designs: PA-LCRTs, multiple-period CRXO trials, and SW-CRTs, each in closed-cohort and repeated cross-sectional form, with two treatment conditions and continuous outcomes. Sample size formulae for these trials and ODs for some of these trials have been developed separately in prior research; however, research comparing ODs across trial designs is limited. Our contributions to the optimal CRT design literature are fourfold. First, we propose algorithms to identify the ODs with the lowest cost for desired levels of power for all six study designs. For SW-CRTs, in particular, we suggest using the number of treatment sequences $L = T - 1$, where $T$ is the number of time-periods, to yield the lowest cost. Second, under a fixed budget, we summarize and propose the ODs with the highest power for each of the six study designs and provide the ETE variance formulae under the OD for PA-LCRTs and multiple-period CRXO trials. Third, our findings suggest that optimal closed-cohort CRXO trials have the lowest cost, followed by optimal repeated cross-sectional CRXO trials and optimal closed-cohort SW-CRTs. In fact, we confirm that optimal closed-cohort CRXO trials excel under both optimization metrics and are global ODs. Fourth, given the suggestion that $L = T - 1$ in both types of SW-CRTs under the OD with the highest power,¹⁴ we compare the six ODs and find that 1) optimal closed-cohort CRXO trials perform the best with the highest power; 2) optimal repeated cross-sectional CRXO trials and optimal closed-cohort SW-CRTs are the next two best designs with the highest power; and 3) decimal estimates show a more obvious trend than integer estimates for PA-LCRTs and CRXO trials.
Finally, it is important to note that, while our development is based on GEE, our results should be equally applicable to trials analyzed by linear mixed models with matching random-effects structures, as we have primarily focused on a continuous outcome. For example, the sample size formulae have been shown to be identical for GEE and linear mixed models with comparable correlation structures for repeated cross-sectional and closed-cohort SW-CRTs (see, for example, Appendix D in Chen et al.).⁴⁶

There are several limitations to this research. First, we only discussed closed-cohort and repeated cross-sectional designs and assumed complete measurements from each participant. Open cohort designs, hybrid designs, continuous recruitment continuous exposure designs and incomplete designs were not included and could be important future work.^7,47–50 Second, our comparisons of ODs are based on locally optimal designs under an assumed set of ICC values. In general, when historical or routinely collected data are available during the study planning stage, one may be able to fit the GEEs or linear mixed models to compute the ICC values, based on methods discussed in Zhang et al. and Ouyang et al.^51,52 Without such preliminary data, a common practice is to elicit the ICC values from the published literature. Specifically in the context of longitudinal CRTs, Korevaar et al. reported and summarized the ICC estimates from a collection of completed trials (the CLustered OUtcome Dataset bank) for a wide range of outcomes to inform future study planning.⁵³ Addressing uncertainty in the ICC values for study planning, such as using the MaxiMin optimal design, has been pursued in Liu and Li for multiple-period CRXO trials and SW-CRTs, and a full comparison of MaxiMin optimal designs deserves future research.¹⁴ On the other hand, the cost parameters are also required inputs for our methods and are generally based on the context, type of clusters, and study budget. Precise cost estimates for recruiting a cluster, recruiting an individual participant, and obtaining an outcome measurement would require collaboration between study investigators, health economists, and statisticians, and best practices for doing so merit continued discussion in the context of longitudinal CRTs.
Third, while our work addresses the basic setting under a constant treatment effect assumption, it is possible that the treatment effect may depend on the calendar time or exposure time across the six LCRTs.^54–56 Our GEE model cannot address these complicated concerns, and we defer the development of ODs under a more complicated treatment effect structure to future research. Fourth, for SW-CRTs, Section 3.2 only addressed consecutive treatment sequences. For example, for a SW-CRT with $T = 6$ and 3 treatment sequences, we considered CIIIII, CCIIII, and CCCIII, where the number of intervention time-periods is 5, 4, and 3, respectively. However, alternative treatment sequences such as “CIIIII, CCCIII, CCCCCI” could also be an option, where the number of intervention time-periods is 5, 3, and 1, respectively. From the trial perspective, the investigator aims to maximize the number of time-periods under intervention. Thus, it should be reasonable to consider our proposed scenario. Given that this research is restricted to these SW-CRTs when comparing across the six designs, further efficiency improvement may be possible when we allow for an additional layer of optimization across treatment sequences, such as those in Lawrie et al. and Li et al.^57,58 However, that type of optimization, along with a local optimization under a total budget, can introduce additional computational challenges. Whether such additional optimization can substantially improve SW-CRTs is an open question for future research. We also assume that an equal number of clusters cross over to the intervention at each time-period for SW-CRTs. However, previous work has shown that equal allocation of clusters to sequences in stepped wedge designs is not optimal.^57–59 Recently, Watson et al. attempted more extensive ODs for CRTs by identifying three broad classes of methods and combining these algorithms to select an optimal subset of cluster sequences.⁶⁰ They determined the optimal allocation of clusters across a set of cluster sequences and the optimal cluster-period size for both Gaussian and non-Gaussian models using exchangeable and exponential decay covariance structures. It would therefore be interesting to combine considerations for treatment sequence optimization and for cost-efficiency in future work. Fifth, we assume that the proportion of clusters receiving the intervention assignment is $π = 50\%$ in PA-LCRTs and CRXO designs, and that unit costs are equal between the two treatment conditions. If, instead, the cost per cluster recruitment is $c_{q}$ currency units, the cost per participant enrollment is $s_{q}$ currency units, and the cost per outcome measurement is $e_{q}$ currency units in treatment condition $q$, $q = 0, 1$, for closed-cohort trials, then the total cost is $TC = m_{cont} (c_{0} + s_{0} n + e_{0} T n) + m_{trt} (c_{1} + s_{1} n + e_{1} T n)$, where $m_{cont}$ and $m_{trt}$ denote the numbers of clusters in the control and treatment conditions. Therefore, future work could explore OD algorithms under alternative assumptions and address between-arm heterogeneity.

We reiterate that our results are developed for selecting optimal longitudinal cluster randomized designs with a continuous outcome, and they may not directly generalize to binary or count outcomes. Previously, Liu et al. considered count data in two-/three-level PA-CRTs and reviewed the optimal designs of two-/three-level PA-CRTs with a binary outcome utilizing GEEs.^{14,33,61–63} However, when the design matrix becomes more complicated (as in SW-CRTs), the variance of the treatment effect estimator is often expressed in matrix form, and no simple, scalar variance expression exists. With binary outcomes, Li et al. derived a matrix-based formula for sample size calculations in SW-CRTs, specifically leveraging an analytic inverse of the block exchangeable correlation matrix to facilitate efficient numerical computation of the variance.¹⁹ Building on that approach, future research could explore numerical methods, such as those in Li et al., ¹⁹ to identify ODs for SW-CRTs and extend our design comparison results to accommodate binary and count outcomes.

In conclusion, this paper fills an important gap in the current OD literature for six possible LCRT designs by discussing two different types of ODs, one with the lowest cost to obtain a desired level of power and the other with the highest power given a fixed budget. We compare the ODs across these six designs and conclude that the optimal closed-cohort CRXO trial is the theoretically global OD. However, in practice, a CRXO design may not always be feasible depending on the intervention and study context. For example, when an intervention is difficult to de-implement, the investigators must choose SW-CRT designs rather than CRXO designs even if our findings show that an optimal CRXO design is more efficient than an optimal SW-CRT design. Hence, in practice, it is possible that an OD under SW-CRT would be more appropriate and appealing, and it should remain a compelling design option.

Supplementary Material

Supplementary Figures-1

NIHMS2171977-supplement-Supplementary_Figures-1.pdf (331.8 KB, PDF)

Supplementary Figures-2

NIHMS2171977-supplement-Supplementary_Figures-2.pdf (406.9 KB, PDF)

Optimal designs with the lowest cost and 80% power to detect a treatment effect size of 0.2 and variance of $σ^{2} = 1$ for known association parameter $(α_{0} = 0.05, α_{1} = 0.020, α_{2} = 0.6)$ assuming cost per cluster $c = 3000$ , cost per participant $s = 200,$ and cost per time-period $e = 50$ across six designs

Acknowledgements

We thank the Alvin J. Siteman Cancer Center at Washington University School of Medicine and Barnes-Jewish Hospital in St. Louis, MO (P30 CA91842), National Institutes of Health (NIH) grant P50 CA244431, Institute of Clinical and Translational Sciences (ICTS) grant CTRFP2019-05, and a Patient-Centered Outcomes Research Institute Award® (PCORI® Award ME-2022C2-27676) for supporting this research. The content is solely the responsibility of the authors and does not necessarily represent the official view of the NIH, PCORI®, or its Board of Governors or Methodology Committee.

Appendix. ODs with the highest power

Here, we focus on ODs with the lowest ETE variance and thus the highest power. In addition to considering design estimates based on integers (i.e. whole humans), which provide investigators with the optimal practical design, we also introduce theoretical ODs at an exact budget $B$ . In closed-cohort LCRTs,

$$B = m (c + s n + e T n). \tag{4}$$

In repeated cross-sectional LCRTs,

$$B = m (c + s T n). \tag{5}$$

The theoretical ODs allow decimal values for $\{n, m\}$ at an exact budget $B$ with Equation (4) or (5), whereas the practical ODs with integer values for $\{n, m\}$ have a total cost less than or equal to budget $B$. Therefore, we can investigate the performance of practical ODs by comparing them with theoretical ODs.

1. OD formulae with the highest power for LCRTs

Following the proof in Liu et al.,¹⁴ the OD formulae for closed-cohort PA-LCRTs using ETE variance (1.1) are as follows:

$$n_{OD} = \sqrt{\frac{ϑ c}{s + e T}}, \quad m_{OD} = \frac{B}{\sqrt{ϑ (s + e T) c} + c}, \tag{6}$$

where $ϑ = \frac{1 + (T - 1) α_{2}}{α_{0} + (T - 1) α_{1}} - 1$. Recall that $λ_{4} = 1 + (n - 1) α_{0} + (T - 1) (n - 1) α_{1} + (T - 1) α_{2} = λ_{2} \left(1 + \frac{n}{ϑ}\right)$, where $λ_{2} = 1 - α_{0} - (T - 1) (α_{1} - α_{2})$. Then the variance of the treatment effect estimator can be written as

$$\mathrm{Var}(β^{*}) = \frac{σ^{2} λ_{4}}{T n m π (1 - π)} = \frac{σ^{2} λ_{2} \left(1 + \frac{n}{ϑ}\right)}{T π (1 - π) n m}.$$

Under the OD,

$$n_{OD} m_{OD} = \frac{B}{\sqrt{s + e T} \left(\sqrt{s + e T} + \sqrt{\frac{c}{ϑ}}\right)}$$

and thus the ETE variance of this OD is

$${\mathrm{Var}(β^{*})}_{OD} = \frac{σ^{2} λ_{4}}{T m n π (1 - π)} = \frac{σ^{2} λ_{2} \left(1 + \frac{1}{ϑ} \sqrt{\frac{ϑ c}{s + e T}}\right)}{T π (1 - π) \frac{B}{\sqrt{s + e T} \left(\sqrt{s + e T} + \sqrt{\frac{c}{ϑ}}\right)}} = \frac{σ^{2} λ_{2}}{T π (1 - π) B} \times \left(\sqrt{s + e T} + \sqrt{\frac{c}{ϑ}}\right)^{2}.$$
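For readers who want the intermediate step, the closed form in Equation (6) can be recovered by substituting the budget constraint (4) into the variance and minimizing over $n$. The block below is a sketch of that calculus argument, treating $n$ as continuous; it is not a reproduction of the proof in Liu et al., and the auxiliary function $g(n)$ is notation introduced here for illustration only.

```latex
% Substitute m = B / {c + (s + eT) n} from Equation (4) into the variance:
\mathrm{Var}(β^{*})
  = \frac{σ^{2} λ_{2}}{T π (1 - π) B}\, g(n),
\qquad
g(n) = \left(\frac{1}{n} + \frac{1}{ϑ}\right)\{c + (s + eT)\, n\}
     = \frac{c}{n} + (s + eT) + \frac{c}{ϑ} + \frac{(s + eT)\, n}{ϑ}.

% First-order condition (g is convex in n because g''(n) = 2c / n^{3} > 0):
g'(n) = -\frac{c}{n^{2}} + \frac{s + eT}{ϑ} = 0
\;\Longrightarrow\;
n_{OD} = \sqrt{\frac{ϑ c}{s + eT}}.

% Value at the optimum, which gives the squared term in the OD variance:
g(n_{OD}) = (s + eT) + \frac{c}{ϑ} + 2\sqrt{\frac{c\,(s + eT)}{ϑ}}
          = \left(\sqrt{s + eT} + \sqrt{\frac{c}{ϑ}}\right)^{2}.
```

Plugging $n_{OD}$ back into Equation (4) gives $m_{OD} = B / \{\sqrt{ϑ (s + eT) c} + c\}$, in agreement with Equation (6).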

Liu et al. also derived the following OD formulae for closed-cohort CRXO designs when $T$ is a predetermined value.¹⁴ It is the same as Equation (6) but with a different $ϑ = \frac{1 - α_{2}}{α_{0} - α_{1}} - 1$ . The ETE variance of this OD is

$${\mathrm{Var}(β^{*})}_{OD} = \frac{σ^{2} λ_{1}}{T π (1 - π) B} \times \left(\sqrt{s + e T} + \sqrt{\frac{c}{ϑ}}\right)^{2}.$$

This proof is provided in Appendix 2 of Liu et al.¹⁴

As described earlier in Section 2.4, we set $α_{2} = α_{1}$ in repeated cross-sectional trials. Therefore, Equation (6) becomes $n_{OD} = \sqrt{\frac{ϑ c}{T s}}$, $m_{OD} = \frac{B}{\sqrt{ϑ T s c} + c}$, where $ϑ = \frac{1 - α_{0}}{α_{0} + (T - 1) α_{1}}$ and $ϑ = \frac{1 - α_{0}}{α_{0} - α_{1}}$ for repeated cross-sectional PA-LCRTs and CRXO trials, respectively. As $α_{0}$ increases, power is non-increasing for optimal PA-LCRTs and CRXO trials, whereas as $α_{1}$ increases, power is decreasing for optimal PA-LCRTs. Finally, as $α_{2}$ increases, power decreases for optimal closed-cohort PA-LCRTs but increases for optimal closed-cohort CRXO trials. OD formulae are not available for SW-CRTs because we cannot derive closed-form formulae (e.g., Equation (6)) using variance (3.1). We can only derive them for PA-LCRTs and CRXO trials.
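The closed-form expressions above are straightforward to evaluate. The following sketch computes $ϑ$ and the decimal OD from Equation (6) for PA-LCRTs and CRXO trials; the function names, the `design` labels, and the convention of passing `a2=None` for repeated cross-sectional trials are illustrative assumptions, not part of the original work.

```python
from math import sqrt


def theta(design, T, a0, a1, a2=None):
    """ϑ for PA-LCRTs or CRXO trials; a2=None means repeated cross-sectional,
    in which case a2 is set equal to a1 as described in the text."""
    if a2 is None:
        a2 = a1
    if design == "PA":
        return (1 + (T - 1) * a2) / (a0 + (T - 1) * a1) - 1
    if design == "CRXO":
        return (1 - a2) / (a0 - a1) - 1
    raise ValueError("closed-form ODs are only available for 'PA' and 'CRXO'")


def decimal_od(B, c, unit_cost, th):
    """Equation (6). unit_cost is s + e*T for closed-cohort trials and
    T*s for repeated cross-sectional trials."""
    n_od = sqrt(th * c / unit_cost)
    m_od = B / (sqrt(th * unit_cost * c) + c)
    return n_od, m_od


# Closed-cohort PA-LCRT, T = 4, (α0, α1, α2) = (0.05, 0.020, 0.6), B = 300,000.
T, c, s, e = 4, 3000, 200, 50
th = theta("PA", T, 0.05, 0.020, 0.6)
n_od, m_od = decimal_od(300_000, c, s + e * T, th)
print(th, n_od, m_od)  # roughly 24.45, 13.54, 35.64
```

By construction the decimal design spends the budget exactly, so `m_od * (c + (s + e * T) * n_od)` equals $B$ up to floating-point error, which gives a simple sanity check on any implementation.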

2. OD algorithms with the highest power for LCRTs

Although cluster-period size and the number of clusters must be integer values in practice, the expressions in Equation (6) can yield non-integer values. To obtain integer estimates, we propose the following OD algorithm for closed-cohort PA-LCRTs:

Step 1. Specify the design space $(2, n_{\max})$ and $(2, m_{\max})$.

Step 2. For each combination of integer values of $n$ and $m$ in the design space,

1) calculate the eigenvalues $λ_{1}, λ_{2}, λ_{3}, λ_{4}$ and check that $\min \{λ_{1}, λ_{2}, λ_{3}, λ_{4}\} > 0$, where $λ_{1} = 1 - α_{0} + α_{1} - α_{2}$, $λ_{2} = 1 - α_{0} - (T - 1) (α_{1} - α_{2})$, $λ_{3} = 1 + (n - 1) (α_{0} - α_{1}) - α_{2}$, $λ_{4} = 1 + (n - 1) α_{0} + (T - 1) (n - 1) α_{1} + (T - 1) α_{2}$;
2) calculate the total cost as $m (c + s n + e T n)$ and check that it is less than or equal to the budget $B$;
3) if conditions 1) and 2) are satisfied, calculate the variance (1.1).

Step 3. Select the design with the smallest variance in the design space, $\{n_{OD}, m_{OD}\}$.
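The three steps above can be sketched as a small grid search. This is an illustrative implementation under stated assumptions rather than the authors' code: it uses the variance form $σ^{2} λ_{4} / \{T n m π (1 - π)\}$, restricts $m$ to even values so that clusters split equally between arms (an assumption not stated in the steps), uses a reduced grid (`n_max`, `m_max`), and reports power from a normal approximation. All names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def od_highest_power(B, T, a0, a1, a2, c=3000, s=200, e=50,
                     n_max=100, m_max=200, beta=0.2, sigma2=1.0, pi=0.5):
    """Grid search for the integer closed-cohort PA-LCRT design with the
    smallest variance whose total cost does not exceed the budget B."""
    lam1 = 1 - a0 + a1 - a2
    lam2 = 1 - a0 - (T - 1) * (a1 - a2)
    best = None
    for n in range(2, n_max + 1):
        lam3 = 1 + (n - 1) * (a0 - a1) - a2
        lam4 = 1 + (n - 1) * a0 + (T - 1) * (n - 1) * a1 + (T - 1) * a2
        # Step 2, condition 1: all eigenvalues must be positive.
        if min(lam1, lam2, lam3, lam4) <= 0:
            continue
        for m in range(2, m_max + 1, 2):  # assumption: even m, equal allocation
            # Step 2, condition 2: stay within the budget.
            if m * (c + s * n + e * T * n) > B:
                break
            var = sigma2 * lam4 / (T * n * m * pi * (1 - pi))
            if best is None or var < best[0]:
                best = (var, n, m)
    var, n, m = best
    z_crit = NormalDist().inv_cdf(0.975)  # two-sided 5% type I error
    power = NormalDist().cdf(abs(beta) / sqrt(var) - z_crit)
    return n, m, power


print(od_highest_power(300_000, 4, 0.05, 0.020, 0.6))
```

For $T = 4$, $(α_{0}, α_{1}, α_{2}) = (0.05, 0.020, 0.6)$ and $B = 300{,}000$, this sketch selects $n = 12$ and $m = 38$ with power of about 0.569, the same design reported for the closed-cohort PA-LCRT in Appendix Table 2. Without the even-$m$ restriction the search would pick a different design, so that restriction is worth confirming against the original algorithm.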

Liu et al. have proposed OD algorithms to obtain integer estimates in CRXO trials and SW-CRTs.¹⁴ They are similar to the above algorithm except that the ETE variance is different: (1.1) in Steps 2–3 is replaced by (2.1) and (3.1), respectively. The OD algorithms can also be applied to repeated cross-sectional trials. For closed-cohort and repeated cross-sectional SW-CRTs, Liu et al. showed that the optimal number of time-periods $T$ is equal to the number of treatment sequences $L$ plus 1.¹⁴

Appendix Table 1 summarizes all proposed ODs including the algorithms for estimating integer values and the formulae for estimating decimal values for PA-LCRTs, multiple-period CRXO trials and SW-CRTs.

3. Comparing ODs with the highest power across six LCRTs

For comparison across the six ODs with the highest power, we assume the same type I error, treatment effect, and unit costs as those described in Section 3.1 and a budget $B = 300,000$ for known association parameter $(α_{0}, α_{1}, α_{2})$ .

3.1. Comparing ODs at a fixed value of $T$

Appendix Table 2 details the obtained power, number of clusters $m_{OD}$, optimal cluster-period size $n_{OD}$, and total sample size $N_{OD}$ under the OD with the highest power for all six ODs assuming $T = 4$. For example, for $(α_{0}, α_{1}, α_{2}) = (0.05, 0.020, 0.6)$, the OD with the highest power for a closed-cohort PA-LCRT has an obtained power of 0.569, number of clusters of 38, cluster-period size of 12, and total sample size of 456. These values are 0.599, 30, 7, and 840, respectively, for a repeated cross-sectional PA-LCRT; 0.999, 48, 8, and 384, respectively, for a closed-cohort CRXO trial; 0.773, 20, 12, and 960, respectively, for a repeated cross-sectional CRXO trial; 0.796, 45, 9, and 405, respectively, for a closed-cohort SW-CRT; and 0.390, 30, 7, and 840, respectively, for a repeated cross-sectional SW-CRT.

Appendix Figure 1 presents the same estimates as Appendix Table 2 but allows the correlation values to vary. Specifically, each panel holds two of the three correlation values $(α_{0}, α_{1}, α_{2})$ constant and lets the remaining one increase. Appendix Figures 1I–1L show horizontal lines for repeated cross-sectional CRXO trials and SW-CRTs.

3.1.1. Comparing power under ODs at a fixed value of $T = 4$

As $α_{0}$ increases, power is non-increasing for all six ODs, whereas as $α_{2}$ increases, power increases for optimal closed-cohort SW-CRTs.

Based on these figures, we can see clearly that optimal closed-cohort CRXO trials have the largest power, followed by optimal repeated cross-sectional CRXO trials and optimal closed-cohort SW-CRTs.

3.1.2. Comparing optimal number of clusters and cluster-period size at a fixed value of $T = 4$

With respect to the optimal number of clusters, closed-cohort CRXO trials have one of the largest numbers, and repeated cross-sectional CRXO trials have one of the smallest numbers.

As $α_{0}$ increases, the optimal number of clusters is non-decreasing, whereas the optimal cluster-period size is non-increasing for all six ODs. As $α_{1}$ increases, 1) the optimal number of clusters is non-decreasing and the optimal cluster-period size is non-increasing for PA-LCRTs; and 2) the optimal number of clusters is non-increasing and the optimal cluster-period size is non-decreasing for CRXO trials and SW-CRTs. Finally, as $α_{2}$ increases, 1) the optimal number of clusters is non-increasing and the optimal cluster-period size is non-decreasing for closed-cohort PA-LCRTs; and 2) the optimal number of clusters is non-decreasing and the optimal cluster-period size is non-increasing for closed-cohort CRXO trials and SW-CRTs.

3.1.3. Comparing the total sample size under ODs at a fixed value of $T = 4$

Repeated cross-sectional CRXO trials and SW-CRTs have the largest total sample sizes. As $α_{0}$ increases, the total sample sizes are non-increasing for all six ODs. As $α_{1}$ increases, the total sample size is 1) non-increasing for PA-LCRTs; and 2) non-decreasing for CRXO trials and SW-CRTs. As $α_{2}$ increases, the total sample size is 1) non-decreasing for closed-cohort PA-LCRTs; and 2) non-increasing for closed-cohort CRXO trials and SW-CRTs.

3.2. Comparing ODs at different values of $T$

Appendix Table 3 extends the above analyses by varying values of $T$ , while holding the correlation parameters constant at $(α_{0} = 0.05, α_{1} = 0.020, α_{2} = 0.6)$ for all six ODs. Appendix Figure 2 presents the corresponding results visually.

As $T$ increases, we see that 1) power decreases for the two PA-LCRTs and increases for the other four designs; 2) the optimal number of clusters generally decreases; 3) the optimal cluster-period size generally decreases; and 4) the total sample size generally decreases.

Appendix Tables 4–5 and Appendix Figures 3–4 show the corresponding decimal estimates for four of the trial designs: closed-cohort and repeated cross-sectional PA-LCRTs and multiple-period CRXO trials. For a fixed $T$ , the optimal CRXO trials have higher power than the optimal PA-LCRTs. The optimal closed-cohort CRXO trial reaches the highest power, followed by the optimal repeated cross-sectional CRXO trial. The optimal repeated cross-sectional trials have the largest total sample size.

For a fixed $T$ , the trends are as follows. 1) As $α_{0}$ increases, the power decreases, the optimal number of clusters increases, the optimal cluster-period size decreases, and the total sample size decreases. 2) As $α_{1}$ increases, the power decreases for optimal PA-LCRTs but increases for optimal CRXO trials; the optimal number of clusters increases for PA-LCRTs but decreases for CRXO trials, whereas the optimal cluster-period size decreases for PA-LCRTs but increases for CRXO trials; and the total sample size decreases for optimal PA-LCRTs but increases for optimal CRXO trials. 3) As $α_{2}$ increases, the power decreases for optimal closed-cohort PA-LCRTs but increases for optimal closed-cohort CRXO trials; the optimal number of clusters decreases for closed-cohort PA-LCRTs but increases for closed-cohort CRXO trials, whereas the optimal cluster-period size increases for closed-cohort PA-LCRTs but decreases for closed-cohort CRXO trials; and the total sample size increases for closed-cohort PA-LCRTs but decreases for closed-cohort CRXO trials.

As $T$ increases, 1) the power decreases for optimal closed-cohort designs but increases for optimal repeated cross-sectional designs; 2) the optimal cluster-period size and the optimal number of clusters decrease for all four ODs; and 3) the total sample size decreases for optimal closed-cohort designs but increases for optimal repeated cross-sectional designs.

Appendix Figure 1 — Optimal designs with the highest power to detect a treatment effect size of 0.2 and variance of $σ^{2} = 1$ for known association parameter ( $α_{0}, α_{1}, α_{2}$ ) assuming a budget $B = 300,000$ , cost per cluster $c = 3000$ , cost per participant $s = 200$ , and cost per time-period $e = 50$ across six designs with $T = 4$

Appendix Figure 2 — Optimal designs with the highest power to detect a treatment effect size of 0.2 and variance of $σ^{2} = 1$ for known association parameter ( $α_{0} = 0.05$ , $α_{1} = 0.020$ , $α_{2} = 0.6$ ) assuming a budget $B = 300,000$ , cost per cluster $c = 3000$ , cost per participant $s = 200$ , and cost per time-period $e = 50$ across six designs

Appendix Figure 3 — Optimal designs with the highest power to detect a treatment effect size of 0.2 and variance of $σ^{2} = 1$ for known association parameter ( $α_{0}, α_{1}, α_{2}$ ) assuming a budget $B = 300,000$ , cost per cluster $c = 3000$ , cost per participant $s = 200$ , and cost per time-period $e = 50$ across four designs with $T = 4$ (Decimal estimates)

Appendix Figure 4 — Optimal designs with the highest power to detect a treatment effect size of 0.2 and variance of $σ^{2} = 1$ for known association parameter ( $α_{0} = 0.05$ , $α_{1} = 0.020$ , $α_{2} = 0.6$ ) assuming a budget $B = 300,000$ , cost per cluster $c = 3000$ , cost per participant $s = 200$ , and cost per time-period $e = 50$ across four designs (Decimal estimates)