Sample size determinations for group-based randomized clinical trials with different levels of data hierarchy between experimental and control arms

Moonseong Heo; Alain H Litwin; Oni Blackstock; Namhee Kim; Julia H Arnsten

doi:10.1177/0962280214547381

Author manuscript; available in PMC: 2018 Feb 1.

Published in final edited form as: Stat Methods Med Res. 2016 Jul 11;26(1):399–413. doi: 10.1177/0962280214547381


PMCID: PMC4329103 NIHMSID: NIHMS660592 PMID: 25125453

Abstract

We derived sample size formulae for detecting main effects in group-based randomized clinical trials with different levels of data hierarchy between experimental and control arms. Such designs are necessary when experimental interventions need to be administered to groups of subjects whereas control conditions need to be administered to individual subjects. This type of trial, often referred to as a partially nested or partially clustered design, has been implemented for management of chronic diseases such as diabetes and is beginning to emerge more commonly in wider clinical settings. Depending on the research setting, the level of hierarchy of data structure for the experimental arm can be three or two, whereas that for the control arm is two or one. Such different levels of data hierarchy assume correlation structures of outcomes that are different between arms, regardless of whether research settings require two or three level data structure for the experimental arm. Therefore, the different correlations should be taken into account for statistical modeling and for sample size determinations. To this end, we considered mixed-effects linear models with different correlation structures between experimental and control arms to theoretically derive and empirically validate the sample size formulae with simulation studies.

Keywords: Group-based intervention, sample size, multi-level data, mixed-effects model, varying sizes

1 Introduction

In clinical trials, interventions are often administered to groups of subjects in an effort to enhance their effects on study outcomes through facilitating social support and reinforcing healthy behaviors among peers.¹ Such group-based treatment models are currently being utilized in clinical care, and are beginning to emerge more commonly.²,³ For example, clinical trials aiming to test the efficacy of group clinical visits (i.e. a group of patients seen simultaneously by a health care provider) have been conducted for the management of diabetes,⁴⁻⁶ hepatitis C,⁷ smoking cessation,⁸ and other medical or psychological conditions.⁹⁻¹² Despite such increasing adoption and implementation of group-based interventions in clinical settings, development of rigorous methods to assess statistical power or to determine sample sizes for such approaches has been lacking.

Group-based interventions are usually compared to control interventions that are administered at the individual level of ungrouped participants. This unique aspect imposes a challenge for statistical power assessment or sample size determinations since the hierarchies of data structures between experimental and control arms are different unlike the case of conventional trials which typically assume identical data structures between arms. For example, when groups are randomly selected for the experimental arm and ungrouped individual subjects are randomly selected for the control arm, the levels of hierarchy for the experimental and control arms will be two and one, respectively. This type of design is often referred to as a partially nested or partially clustered design. In such studies, the differences in correlations should be taken into account for both statistical modeling and sample size determinations as illustrated by Roberts and Roberts.¹³ For partially clustered designs under two-level data structures, Bauer et al.¹⁴ discussed several approaches and Baldwin et al.¹⁵ evaluated those statistical models in terms of bias of variance components, type I error and power with extensive simulations, yet without theoretical derivations. Moerbeek and Wong¹⁶ theoretically derived sample size formulae for both continuous and binary outcome data with equal group sizes but did not validate these with simulation studies.

In this study, we extend the partially clustered design to larger scale trials that use multiple centers in which groups and individual subjects are nested. Thus, the levels of hierarchy for the experimental and control arms will be three and two, respectively. The aim of this paper is to derive power functions for testing main effects of group-based interventions in comparison to individual-based control conditions based on two- and three-level mixed-effects linear models. Both equal and varying cluster sizes are considered, and simulation studies are conducted to validate the derived sample size formulae. Throughout this paper, the continuous outcome will be denoted by Y, and the arm indicator will be denoted by X, with X = 0 for the control arm and X = 1 for the experimental arm. The number of nested units at each level may vary across nesting units. The sets of indices indicating observations from the experimental and control arm subjects will be denoted by E and C, respectively. Although the nomenclature for units of levels should depend on study context, here we refer to “center”, “group”, and “subject” as the third, the second, and the first level data units, respectively.

2 Two-level model

2.1 Statistical model

When the group-based experimental arm assumes a two level data structure, and the control arm assumes a one level data structure, a pertinent mixed-effects linear model can be formulated as follows

Y_{j k}^{(2)} = β_{0} + δ^{(2)} X_{j k} + u_{j} X_{j k} + e_{j k}

(1)

The two sets, E and C, are defined as: E = {j, k | X_jk = 1} and C = {j, k | X_jk = 0}. Groups in the experimental arm are indexed by j = 1, 2, …, J for j ∈ E so that #{j | j ∈ E} = J, where #{.} denotes the number of elements in the set {.}. Although there are no groups in the control arm, we assign “pseudo” group indices for observations from the control arm subjects as follows: j = J + 1, J + 2, …, J + J′ for j ∈ C so that #{j | j ∈ C} = J′. Subjects are indexed by k = 1, 2, …, K_j and K_j > 1 for j ∈ E and K_j = 1 for j ∈ C so that each subject serves as his/her own group in the control arm. The total number of subjects in the experimental arm is denoted by $N_{E} = \sum_{j = 1}^{J} K_{j}$ , whereas that in the control arm is denoted by $N_{C} = \sum_{j = J + 1}^{J + J^{'}} K_{j} = J^{'}$ .

The fixed-effect parameters β₀ and δ⁽²⁾ represent the intercept and the main effect of the experimental intervention on the outcome Y⁽²⁾, respectively. Group-specific random effects in the experimental arm are denoted by $u_{j} \sim N (0, σ_{2}^{2})$ for j ∈ E. The random noise is denoted by e_jk, which is assumed to follow $e_{j k} \sim N (0, σ_{e}^{2})$ . Furthermore, u_j and e_jk are assumed to be mutually independent (mutual independence assumption), and the elements of e_jk are assumed to be independent over k for given u_j (conditional independence assumption). We assume that the magnitudes of all variance components in model (1) are known.

Under these assumptions, it can be shown that $E (Y_{j k}^{(2)}) = β_{0} + δ^{(2)} X_{j k}$ and $Var (Y_{j k}^{(2)}) = σ_{e}^{2} + σ_{2}^{2} X_{j k}$ . Let $σ^{2} \equiv Var (Y_{j k}^{(2)} ∣ X_{j k} = 1) = σ_{e}^{2} + σ_{2}^{2}$ denote the variance of Y⁽²⁾ in the experimental arm, then the intra-class correlation coefficient (ICC) of outcome Y within groups in the experimental arm can be expressed as

ρ \equiv Corr (Y_{j k}, Y_{j k^{'}}) = σ_{2}^{2} / σ^{2}

(2)

for k ≠ k′ and j ∈ E. The variance of Y⁽²⁾ in the control arm is simply equal to $σ_{e}^{2}$ , i.e. $Var (Y_{j k}^{(2)} ∣ X_{j k} = 0) = σ_{e}^{2} = (1 - ρ) σ^{2} \leq σ^{2}$ for j ∈ C. The null hypothesis for testing the significance of the main effect of experimental intervention is H₀: δ⁽²⁾ = 0.
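The relations among σ², ρ and the control-arm variance under model (1) can be illustrated with a small sketch. The function name and the example variance components are hypothetical and chosen only for illustration.

```python
def two_level_summary(sigma_e2: float, sigma_22: float) -> dict:
    """Summarize model (1) from its variance components (illustrative helper).

    sigma_e2: residual variance sigma_e^2
    sigma_22: group-level variance sigma_2^2 (experimental arm only)
    """
    if sigma_e2 <= 0 or sigma_22 < 0:
        raise ValueError("sigma_e2 must be positive and sigma_22 non-negative")
    sigma2 = sigma_e2 + sigma_22          # Var(Y | X = 1)
    rho = sigma_22 / sigma2               # ICC, equation (2)
    var_control = (1.0 - rho) * sigma2    # Var(Y | X = 0) = sigma_e^2
    return {"sigma2": sigma2, "rho": rho, "var_control": var_control}


if __name__ == "__main__":
    # Hypothetical variance components, not taken from the paper.
    print(two_level_summary(sigma_e2=0.8, sigma_22=0.2))
```

The control-arm variance never exceeds the experimental-arm variance because the group-level component σ₂² is absent for ungrouped subjects.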

2.2 Parameter estimates and variances

An estimate of δ⁽²⁾ under model (1) can be obtained as

{\hat{δ}}^{(2)} = {\bar{Y}}_{E}^{(2)} - {\bar{Y}}_{C}^{(2)}

where

{\bar{Y}}_{E}^{(2)} = \sum_{j = 1}^{J} \sum_{k = 1}^{K_{j}} Y_{j k} / \sum_{j = 1}^{J} K_{j} = \sum_{j = 1}^{J} \sum_{k = 1}^{K_{j}} Y_{j k} / N_{E} and {\bar{Y}}_{C}^{(2)} = \sum_{j = J + 1}^{J + J^{'}} Y_{j k} / \sum_{j = J + 1}^{J + J^{'}} K_{j} = \sum_{j = J + 1}^{J + J^{'}} Y_{j k} / N_{C}

are the sample means of Y⁽²⁾ in the experimental and control arms, respectively. The variances of these means can be expressed as

Var ({\bar{Y}}_{E}^{(2)}) = σ^{2} / N_{E} + (\sum_{j = 1}^{J} K_{j}^{2} / N_{E}^{2} - 1 / N_{E}) σ_{2}^{2} = σ^{2} {(1 - ρ) / N_{E} + ρ \sum_{j = 1}^{J} K_{j}^{2} / N_{E}^{2}}

and

Var ({\bar{Y}}_{C}^{(2)}) = σ_{e}^{2} / N_{C} = (1 - ρ) σ^{2} / N_{C} .

It follows that

Var ({\hat{δ}}^{(2)}) = Var ({\bar{Y}}_{E}^{(2)}) + Var ({\bar{Y}}_{C}^{(2)}) = σ^{2} {(1 - ρ) (1 / N_{E} + 1 / N_{C}) + ρ \sum_{j = 1}^{J} K_{j}^{2} / N_{E}^{2}}

(3)
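As a rough illustration, the variance (3) can be evaluated for arbitrary group sizes as follows. The function name and argument layout are assumptions made for this sketch, not part of the paper.

```python
from typing import Sequence


def var_delta_two_level(sigma2: float, rho: float,
                        group_sizes: Sequence[int], n_control: int) -> float:
    """Variance of the estimated main effect under model (1), equation (3).

    sigma2:      outcome variance in the experimental arm
    rho:         ICC within experimental-arm groups, equation (2)
    group_sizes: the K_j of the J experimental-arm groups
    n_control:   number of (ungrouped) control subjects N_C
    """
    n_e = sum(group_sizes)                      # N_E
    sum_k2 = sum(k * k for k in group_sizes)    # sum of K_j^2
    return sigma2 * ((1.0 - rho) * (1.0 / n_e + 1.0 / n_control)
                     + rho * sum_k2 / n_e ** 2)


if __name__ == "__main__":
    # 18 groups of 10 subjects versus 180 controls, with sigma^2 = 1.
    print(var_delta_two_level(1.0, 0.2, [10] * 18, 180))
```

With equal group sizes and N_E = N_C = JK the result agrees with the closed form σ²{2 + ρ(K − 2)}/(JK).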

2.3 Test statistics, power functions and sample size formulae

The following test statistic D⁽²⁾ can be used to test the null hypothesis H₀: δ⁽²⁾ = 0

D^{(2)} = \frac{{\hat{δ}}^{(2)}}{\sqrt{Var ({\hat{δ}}^{(2)})}} = \frac{{\bar{Y}}_{E}^{(2)} - {\bar{Y}}_{C}^{(2)}}{σ \sqrt{(1 - ρ) (1 / N_{E} + 1 / N_{C}) + ρ \sum_{j = 1}^{J} K_{j}^{2} / N_{E}^{2}}} .

The power function of D⁽²⁾ at a two-sided significance level of α can be expressed as follows

φ^{(2)} \equiv φ (D^{(2)}) = Φ {Δ_{(2)} / \sqrt{(1 - ρ) (1 / N_{E} + 1 / N_{C}) + ρ \sum_{j = 1}^{J} K_{j}^{2} / N_{E}^{2}} - Φ^{- 1} (1 - α / 2)}

(4)

where Δ₍₂₎ = |δ⁽²⁾/σ| is a standardized effect size or Cohen’s d¹⁷ and Φ is the cumulative distribution function (CDF) of a standard normal distribution. Determination of varying group sizes K_j’s, or N_E, for fixed N_C can be made by solving equation (4) iteratively for K_j’s. On the other hand, determination of sample size N_C for fixed group sizes K_j’s, or N_E should be straightforward by solving equation (4) for N_C.
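A minimal sketch of the power function (4), and of solving it for N_C with the group sizes held fixed, might look as follows. The function names are hypothetical, and rounding N_C up to the next integer is an assumption of this sketch.

```python
import math
from statistics import NormalDist
from typing import Optional, Sequence

_STD_NORMAL = NormalDist()


def power_two_level(delta_std: float, rho: float, group_sizes: Sequence[int],
                    n_control: int, alpha: float = 0.05) -> float:
    """Power function (4) for possibly unequal group sizes K_j."""
    n_e = sum(group_sizes)
    sum_k2 = sum(k * k for k in group_sizes)
    factor = ((1.0 - rho) * (1.0 / n_e + 1.0 / n_control)
              + rho * sum_k2 / n_e ** 2)
    return _STD_NORMAL.cdf(delta_std / math.sqrt(factor)
                           - _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0))


def n_control_for_power(delta_std: float, rho: float,
                        group_sizes: Sequence[int], power: float = 0.8,
                        alpha: float = 0.05) -> Optional[int]:
    """Smallest N_C reaching the target power in (4); None if unattainable."""
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) + _STD_NORMAL.inv_cdf(power)
    n_e = sum(group_sizes)
    sum_k2 = sum(k * k for k in group_sizes)
    # Room left for the control-arm term (1 - rho) / N_C.
    slack = (delta_std / z) ** 2 - (1.0 - rho) / n_e - rho * sum_k2 / n_e ** 2
    if slack <= 0:
        return None  # the experimental arm alone is too small
    return math.ceil((1.0 - rho) / slack)


if __name__ == "__main__":
    sizes = [10] * 18
    print(power_two_level(0.4, 0.2, sizes, 180))
    print(n_control_for_power(0.4, 0.2, sizes))
```

The `None` return covers the case in which no control-arm size can compensate for an experimental arm that is too small, because the experimental-arm terms in (4) do not shrink as N_C grows.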

For special cases of equal number of units, sample size determinations are much more tractable. Suppose that group sizes are equal in the experimental arm, i.e. K_j = K for all j ∈ E so that N_E = JK, and that total numbers of subjects are the same between the two arms, i.e. N_E = JK = N_C, then the variance (3) can be reduced to

Var ({\hat{δ}}^{(2)}) = \frac{σ^{2}}{J K} {2 + ρ (K - 2)}

which simplifies the power function (4) to

φ^{(2)} = Φ {Δ_{(2)} \sqrt{\frac{J K}{2 + ρ (K - 2)}} - Φ^{- 1} (1 - α / 2)}

(5)
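The reduction of the variance (3) under equal sizes can be written out step by step. The only inputs are K_j = K and N_E = N_C = JK, which give \sum_j K_j^2 = J K^2.

```latex
\begin{aligned}
\mathrm{Var}(\hat{\delta}^{(2)})
 &= \sigma^{2}\left\{(1-\rho)\left(\frac{1}{JK}+\frac{1}{JK}\right)
    + \rho\,\frac{JK^{2}}{(JK)^{2}}\right\} \\
 &= \frac{\sigma^{2}}{JK}\left\{2(1-\rho)+\rho K\right\}
  = \frac{\sigma^{2}}{JK}\left\{2+\rho(K-2)\right\}.
\end{aligned}
```

Dividing δ⁽²⁾ by the square root of this variance yields the term Δ₍₂₎ √[JK / {2 + ρ(K − 2)}] inside Φ in equation (5).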

It follows that the number of groups for fixed group sizes in E can be expressed as

J = \frac{{2 + ρ (K - 2)} z_{α, φ}^{2}}{K Δ_{(2)}^{2}}

(6)

and the group sizes for fixed number of groups in E can be expressed as

K = \frac{2 (1 - ρ) z_{α, φ}^{2}}{J Δ_{(2)}^{2} - ρ z_{α, φ}^{2}}

(7)

for $ρ < J Δ_{(2)}^{2} / z_{α, φ}^{2}$ , where

z_{α, φ} = Φ^{- 1} (1 - α / 2) + Φ^{- 1} (φ)

(8)

and Φ⁻¹ is the inverse CDF of a standard normal distribution. We note that when K needs to be determined for a desired level of power for a given J, it is possible that K cannot be determined, especially when J is small and ρ (2) is large, resulting in a combination which can make the denominator of equation (7) negative. Therefore, when only a limited number of groups J are feasible to form, the correlation ρ in particular must be small enough for $ρ < J Δ_{(2)}^{2} / z_{α, φ}^{2}$ to be true in equation (7).
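Equations (6) to (8), together with the feasibility condition on ρ, can be sketched as below. The function names are hypothetical, and rounding up to the next integer is an assumption of this sketch, although it is consistent with the values reported in Tables 1 and 2.

```python
import math
from statistics import NormalDist
from typing import Optional


def z_alpha_phi(alpha: float, power: float) -> float:
    """Equation (8): sum of the two standard normal quantiles."""
    nd = NormalDist()
    return nd.inv_cdf(1.0 - alpha / 2.0) + nd.inv_cdf(power)


def groups_needed(delta_std: float, rho: float, k: int,
                  power: float = 0.8, alpha: float = 0.05) -> int:
    """Equation (6): number of groups J for a fixed group size K."""
    z2 = z_alpha_phi(alpha, power) ** 2
    return math.ceil((2.0 + rho * (k - 2)) * z2 / (k * delta_std ** 2))


def group_size_needed(delta_std: float, rho: float, j: int,
                      power: float = 0.8, alpha: float = 0.05) -> Optional[int]:
    """Equation (7): group size K for a fixed number of groups J.

    Returns None when rho >= J * Delta^2 / z^2, i.e. when the denominator
    of (7) is not positive and no K can reach the target power.
    """
    z2 = z_alpha_phi(alpha, power) ** 2
    denom = j * delta_std ** 2 - rho * z2
    if denom <= 0:
        return None
    return math.ceil(2.0 * (1.0 - rho) * z2 / denom)


if __name__ == "__main__":
    print(groups_needed(0.4, 0.2, k=10))        # first row of Table 1
    print(group_size_needed(0.4, 0.025, j=5))   # first row of Table 2
    print(group_size_needed(0.4, 0.2, j=5))     # infeasible combination
```

Returning `None` rather than a negative K makes the infeasible region explicit to the caller.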

3 Three-level model

3.1 Statistical model

When the group-based intervention arm assumes a three level data structure while the control arm assumes a two level data structure, model (1) can be extended as follows

Y_{ijk}^{(3)} = β_{0} + δ^{(3)} X_{ijk} + u_{i} + u_{j (i)} X_{ijk} + e_{ijk}

(9)

The two sets are defined as E = {i, j, k | X_ijk = 1} and C = {i, j, k | X_ijk = 0}. Centers are indexed by i = 1, 2, …, I for i ∈ E so that #{i | i ∈ E} = I and i = I + 1, I + 2, …, I + I′ for i ∈ C so that #{i | i ∈ C} = I′. Groups are indexed by j = 1, 2, …, J_i and subjects by k = 1, 2, …, K_ij. Concerning the number of groups within centers, J_i > 1 for i ∈ E and J_i = 1 for i ∈ C; that is, we assign pseudo group indices so that each center serves as its own single nesting pseudo group in the control arm. Similarly, concerning group sizes (or the number of subjects within groups), K_ij = K_i for i ∈ C. The total number of subjects in E is denoted by $N_{E} = \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} K_{i j}$ whereas that in C is denoted by $N_{C} = \sum_{i = I + 1}^{I + I^{'}} K_{i j} = \sum_{i = I + 1}^{I + I^{'}} K_{i}$ .

The fixed-effect parameters β₀ and δ⁽³⁾ represent the intercept and the main effect of the experimental intervention on the outcome Y⁽³⁾, respectively, whereas the center-specific random intercepts are denoted by $u_{i} \sim N (0, σ_{3}^{2})$ . The group-specific random intercepts within the experimental arm centers are denoted by $u_{j (i)} \sim N (0, σ_{2}^{2})$ for j ∈ E. The random noise is denoted by e_ijk, which is assumed to follow $e_{ijk} \sim N (0, σ_{e}^{2})$ . These three random components are assumed to be mutually independent (mutual independence assumption). In addition, the elements of e_ijk are assumed to be independent over k for given u_i and u_j(i), and those of u_j(i) are assumed to be independent over j for given u_i (conditional independence assumption). Again, we assume that the magnitudes of all variance components in model (9) are known.

It follows that $E (Y_{ijk}^{(3)}) = β_{0} + δ^{(3)} X_{ijk}$ and $Var (Y_{ijk}^{(3)}) = σ_{e}^{2} + σ_{2}^{2} X_{ijk} + σ_{3}^{2}$ . Let $σ^{2} \equiv Var (Y_{ijk}^{(3)} ∣ X_{ijk} = 1) = σ_{e}^{2} + σ_{2}^{2} + σ_{3}^{2}$ denote the variance of Y⁽³⁾ in the experimental arm. Then, the correlations among the group-level observations can be obtained as

ρ_{2} \equiv Corr (Y_{ijk}, Y_{i j^{'} k^{'}}) = σ_{3}^{2} / σ^{2}

(10)

for j ≠ j′ and i ∈ E. The correlations among the subject-level observations can be obtained as

ρ_{1} \equiv Corr (Y_{ijk}, Y_{i j k^{'}}) = (σ_{2}^{2} + σ_{3}^{2}) / σ^{2}

(11)

for k ≠ k′ and i ∈ E. On the other hand, the variance of Y⁽³⁾ in the control arm is denoted by $Var (Y_{ijk}^{(3)} ∣ X_{ijk} = 0) = σ_{e}^{2} + σ_{3}^{2} = (1 - ρ_{1} + ρ_{2}) σ^{2} \leq σ^{2}$ since ρ₂ ≤ ρ₁. The null hypothesis for testing the significance of the main effect of experimental intervention is H₀: δ⁽³⁾ = 0.
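The two correlations (10) and (11) and the control-arm variance can be computed from the variance components of model (9) as in the sketch below. The function name and the example values are hypothetical.

```python
def three_level_summary(sigma_e2: float, sigma_22: float,
                        sigma_32: float) -> dict:
    """Summarize model (9) from its variance components (illustrative helper).

    sigma_e2: residual variance sigma_e^2
    sigma_22: group-level variance sigma_2^2 (experimental arm only)
    sigma_32: center-level variance sigma_3^2 (both arms)
    """
    if sigma_e2 <= 0 or sigma_22 < 0 or sigma_32 < 0:
        raise ValueError("variance components must be non-negative")
    sigma2 = sigma_e2 + sigma_22 + sigma_32      # Var(Y | X = 1)
    rho2 = sigma_32 / sigma2                     # equation (10)
    rho1 = (sigma_22 + sigma_32) / sigma2        # equation (11)
    var_control = (1.0 - rho1 + rho2) * sigma2   # Var(Y | X = 0)
    return {"sigma2": sigma2, "rho1": rho1, "rho2": rho2,
            "var_control": var_control}


if __name__ == "__main__":
    # Hypothetical variance components, not taken from the paper.
    print(three_level_summary(sigma_e2=0.6, sigma_22=0.3, sigma_32=0.1))
```

Because ρ₁ − ρ₂ = σ₂²/σ² is non-negative, ρ₂ ≤ ρ₁ always holds and the control-arm variance cannot exceed σ².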

3.2 Parameter estimates and variances

An estimate of δ⁽³⁾ under model (9) can be obtained as

{\hat{δ}}^{(3)} = {\bar{Y}}_{E}^{(3)} - {\bar{Y}}_{C}^{(3)}

where

\begin{array}{l} {\bar{Y}}_{E}^{(3)} = \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} \sum_{k = 1}^{K_{i j}} Y_{ijk} / \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} K_{i j} = \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} \sum_{k = 1}^{K_{i j}} Y_{ijk} / N_{E} and \\ {\bar{Y}}_{C}^{(3)} = \sum_{i = I + 1}^{I + I^{'}} \sum_{k = 1}^{K_{i j}} Y_{ijk} / \sum_{i = I + 1}^{I + I^{'}} K_{i} = \sum_{i = I + 1}^{I + I^{'}} \sum_{k = 1}^{K_{i j}} Y_{ijk} / N_{C} \end{array}

are the sample means of Y⁽³⁾ in the experimental and control arms, respectively. The variances of these means can be expressed as

Var ({\bar{Y}}_{E}^{(3)}) = σ^{2} {\frac{1 - ρ_{1}}{N_{E}} + \frac{1}{N_{E}^{2}} (ρ_{1} \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} K_{i j}^{2} + 2 ρ_{2} \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} \sum_{j^{'} > j}^{J_{i}} K_{i j} K_{i j^{'}})}

(12)

and

Var ({\bar{Y}}_{C}^{(3)}) = σ^{2} {(1 - ρ_{1}) / N_{C} + ρ_{2} \sum_{i = I + 1}^{I + I^{'}} K_{i}^{2} / N_{C}^{2}}

(13)

It follows that

\begin{array}{l} Var ({\hat{δ}}^{(3)}) = Var ({\bar{Y}}_{E}^{(3)}) + Var ({\bar{Y}}_{C}^{(3)}) \\ = σ^{2} {(1 - ρ_{1}) (\frac{1}{N_{E}} + \frac{1}{N_{C}}) + ρ_{2} (\frac{2}{N_{E}^{2}} \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} \sum_{j^{'} > j}^{J_{i}} K_{i j} K_{i j^{'}} + \frac{1}{N_{C}^{2}} \sum_{i = I + 1}^{I + I^{'}} K_{i}^{2}) + \frac{ρ_{1}}{N_{E}^{2}} \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} K_{i j}^{2}} \end{array}

(14)

3.3 Test statistics, power functions and sample size formulae

The following test statistic D⁽³⁾ can be used to test the null hypothesis H₀: δ⁽³⁾ = 0

\begin{array}{l} D^{(3)} = \frac{{\hat{δ}}^{(3)}}{\sqrt{Var ({\hat{δ}}^{(3)})}} \\ = \frac{{\bar{Y}}_{E}^{(3)} - {\bar{Y}}_{C}^{(3)}}{σ \sqrt{(1 - ρ_{1}) (\frac{1}{N_{E}} + \frac{1}{N_{C}}) + ρ_{2} (\frac{2}{N_{E}^{2}} \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} \sum_{j^{'} > j}^{J_{i}} K_{i j} K_{i j^{'}} + \frac{1}{N_{C}^{2}} \sum_{i = I + 1}^{I + I^{'}} K_{i}^{2}) + \frac{ρ_{1}}{N_{E}^{2}} \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} K_{i j}^{2}}} . \end{array}

The power function of D⁽³⁾ at a two-sided significance level of α can be expressed as follows

\begin{array}{l} φ^{(3)} \equiv φ (D^{(3)}) \\ = Φ {Δ_{(3)} / \sqrt{(1 - ρ_{1}) (\frac{1}{N_{E}} + \frac{1}{N_{C}}) + ρ_{2} (\frac{2}{N_{E}^{2}} \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} \sum_{j^{'} > j}^{J_{i}} K_{i j} K_{i j^{'}} + \frac{1}{N_{C}^{2}} \sum_{i = I + 1}^{I + I^{'}} K_{i}^{2}) + \frac{ρ_{1}}{N_{E}^{2}} \sum_{i = 1}^{I} \sum_{j = 1}^{J_{i}} K_{i j}^{2}} - Φ^{- 1} (1 - α / 2)} \end{array}

(15)

where Δ₍₃₎ = |δ⁽³⁾/σ| is again a standardized effect size. The determinations of varying group sizes in the experimental arm and cluster sizes in both arms should be made in an iterative manner.
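For unequal unit sizes, the power function (15) can be evaluated directly once the sizes are listed. The sketch below assumes a nested-list layout for the experimental arm (one list of group sizes K_ij per center) and a flat list of center sizes K_i for the control arm; the function names and this layout are illustrative choices.

```python
import math
from statistics import NormalDist
from typing import Sequence


def variance_factor_three_level(rho1: float, rho2: float,
                                exp_sizes: Sequence[Sequence[int]],
                                ctrl_sizes: Sequence[int]) -> float:
    """Var(delta_hat) / sigma^2 from equation (14)."""
    n_e = sum(sum(center) for center in exp_sizes)
    n_c = sum(ctrl_sizes)
    sum_k2 = sum(k * k for center in exp_sizes for k in center)
    # 2 * sum_{j < j'} K_ij K_ij' = (sum_j K_ij)^2 - sum_j K_ij^2 per center.
    cross = sum(sum(center) ** 2 - sum(k * k for k in center)
                for center in exp_sizes)
    sum_ci2 = sum(k * k for k in ctrl_sizes)
    return ((1.0 - rho1) * (1.0 / n_e + 1.0 / n_c)
            + rho2 * (cross / n_e ** 2 + sum_ci2 / n_c ** 2)
            + rho1 * sum_k2 / n_e ** 2)


def power_three_level(delta_std: float, rho1: float, rho2: float,
                      exp_sizes: Sequence[Sequence[int]],
                      ctrl_sizes: Sequence[int], alpha: float = 0.05) -> float:
    """Power function (15) at two-sided significance level alpha."""
    nd = NormalDist()
    factor = variance_factor_three_level(rho1, rho2, exp_sizes, ctrl_sizes)
    return nd.cdf(delta_std / math.sqrt(factor) - nd.inv_cdf(1.0 - alpha / 2.0))


if __name__ == "__main__":
    # Equal sizes: I = 14 centers, J = 5 groups, K = 10 subjects per group.
    exp = [[10] * 5 for _ in range(14)]
    ctrl = [50] * 14
    print(power_three_level(0.4, 0.4, 0.1, exp, ctrl))
```

An iterative sample size search can wrap `power_three_level`, changing one size at a time until the target power is reached.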

For special cases where there are equal numbers of units, sample size determinations are much more tractable. Suppose that the sizes of units are equal in both arms, i.e. J_i = J for all i ∈ E and K_ij = K for all i, j ∈ E so that N_E = IJK in the experimental arm, and K_i = JK for all i ∈ C and I′ = I so that N_C = IJK in the control arm (the total numbers of subjects are the same between the two arms). Then the variances (12), (13) and (14) can be, respectively, reduced to

\begin{array}{l} Var ({\bar{Y}}_{E}^{(3)}) = \frac{σ^{2}}{IJK} {1 + (K - 1) ρ_{1} + K (J - 1) ρ_{2}}, \\ Var ({\bar{Y}}_{C}^{(3)}) = \frac{σ^{2}}{IJK} (1 - ρ_{1} + J K ρ_{2}) \end{array}

and

Var ({\hat{δ}}^{(3)}) = \frac{σ^{2}}{IJK} {2 + (K - 2) ρ_{1} + K (2 J - 1) ρ_{2}} .
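The reduction of (14) under equal unit sizes can be checked by evaluating each sum with J_i = J, K_ij = K, K_i = JK and I′ = I, so that N_E = N_C = IJK:

```latex
\begin{aligned}
\sum_{i=1}^{I}\sum_{j=1}^{J} K_{ij}^{2} &= IJK^{2}, \qquad
2\sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{j'>j}^{J} K_{ij}K_{ij'} = IJ(J-1)K^{2}, \qquad
\sum_{i=I+1}^{2I} K_{i}^{2} = I(JK)^{2}, \\
\mathrm{Var}(\hat{\delta}^{(3)})
 &= \sigma^{2}\left\{\frac{2(1-\rho_{1})}{IJK}
   + \rho_{2}\,\frac{IJ(J-1)K^{2}+IJ^{2}K^{2}}{(IJK)^{2}}
   + \rho_{1}\,\frac{IJK^{2}}{(IJK)^{2}}\right\} \\
 &= \frac{\sigma^{2}}{IJK}\left\{2(1-\rho_{1}) + K(2J-1)\rho_{2} + K\rho_{1}\right\}
  = \frac{\sigma^{2}}{IJK}\left\{2+(K-2)\rho_{1}+K(2J-1)\rho_{2}\right\}.
\end{aligned}
```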

It follows that the power function (15) can be simplified to

φ^{(3)} \equiv φ (D^{(3)}) = Φ {Δ_{(3)} \sqrt{IJK / {2 + (K - 2) ρ_{1} + K (2 J - 1) ρ_{2}}} - Φ^{- 1} (1 - α / 2)}

(16)

Subsequently, the sample size of each level can be determined as follows

I = \frac{{2 + (K - 2) ρ_{1} + K (2 J - 1) ρ_{2}} z_{α, φ}^{2}}{J K Δ_{(3)}^{2}}

(17)

J = \frac{{2 + (K - 2) ρ_{1} - K ρ_{2}} z_{α, φ}^{2}}{I K Δ_{(3)}^{2} - 2 K ρ_{2} z_{α, φ}^{2}}

(18)

for $ρ_{2} < I K Δ_{(3)}^{2} / (2 K z_{α, φ}^{2})$ , and

K = \frac{2 (1 - ρ_{1}) z_{α, φ}^{2}}{I J Δ_{(3)}^{2} - {ρ_{1} + (2 J - 1) ρ_{2}} z_{α, φ}^{2}}

(19)

for combinations of ρ₁ (11) and ρ₂ (10) that make the denominator of equation (19) positive, where z_{α, φ} is defined as in (8). Again, we note that determinations of J (18) with given I and K or determinations of K (19) with given I and J may not be possible for desired power with certain combinations of Δ₍₃₎, ρ₁, ρ₂, and IK or IJ, respectively. For example, ρ₂ in particular must be very small for small Δ₍₃₎, I and J in equation (19).
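The three closed-form expressions (17) to (19) can be sketched as below. The function names are hypothetical, and results are rounded up to the next integer, which is an assumption of this sketch that agrees with the values in Tables 4 and 5. Infeasible combinations return `None`.

```python
import math
from statistics import NormalDist
from typing import Optional


def _z2(alpha: float, power: float) -> float:
    """Square of z_{alpha, phi} from equation (8)."""
    nd = NormalDist()
    return (nd.inv_cdf(1.0 - alpha / 2.0) + nd.inv_cdf(power)) ** 2


def centers_needed(delta_std: float, rho1: float, rho2: float, j: int, k: int,
                   power: float = 0.8, alpha: float = 0.05) -> int:
    """Equation (17): number of centers I per arm for given J and K."""
    design = 2.0 + (k - 2) * rho1 + k * (2 * j - 1) * rho2
    return math.ceil(design * _z2(alpha, power) / (j * k * delta_std ** 2))


def groups_per_center_needed(delta_std: float, rho1: float, rho2: float,
                             i: int, k: int, power: float = 0.8,
                             alpha: float = 0.05) -> Optional[int]:
    """Equation (18): groups per center J for given I and K."""
    z2 = _z2(alpha, power)
    denom = i * k * delta_std ** 2 - 2.0 * k * rho2 * z2
    if denom <= 0:
        return None  # rho2 too large for this I and effect size
    return math.ceil((2.0 + (k - 2) * rho1 - k * rho2) * z2 / denom)


def subjects_per_group_needed(delta_std: float, rho1: float, rho2: float,
                              i: int, j: int, power: float = 0.8,
                              alpha: float = 0.05) -> Optional[int]:
    """Equation (19): subjects per group K for given I and J."""
    z2 = _z2(alpha, power)
    denom = i * j * delta_std ** 2 - (rho1 + (2 * j - 1) * rho2) * z2
    if denom <= 0:
        return None  # correlations too large for this I, J and effect size
    return math.ceil(2.0 * (1.0 - rho1) * z2 / denom)


if __name__ == "__main__":
    print(centers_needed(0.4, 0.4, 0.1, j=5, k=10))              # Table 4, row 1
    print(subjects_per_group_needed(0.4, 0.1, 0.01, i=5, j=5))   # Table 5, row 1
    print(subjects_per_group_needed(0.4, 0.4, 0.1, i=5, j=5))    # infeasible
```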

4 Simulation studies and results

We conducted simulation studies to validate the derived sample size formulae for both two-level models (1) and three-level models (9). We used SAS PROC MIXED to fit those models with unknown variances which are usually assumed in practice, and to compute simulation-based empirical power. This computation was based on critical values of t-distributions under the null hypotheses with degrees of freedom determined based on the method proposed by Kenward and Roger.¹⁸ Throughout this section, a nominal statistical power is set at 80%, a two-sided significance level is set at α=0.05, and 1000 simulations were conducted for each combination of design parameters. Therefore, the 95% confidence intervals (CI) for the empirical power will be 0.8±0.025.
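The ±0.025 margin quoted above can be reproduced with a normal approximation to the binomial proportion of rejections. The paper does not state which interval method was used, so the approximation and the function name in this sketch are assumptions.

```python
from statistics import NormalDist


def power_ci_half_width(power: float, n_sims: int, level: float = 0.95) -> float:
    """Half-width of a normal-approximation CI for a simulated power."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * (power * (1.0 - power) / n_sims) ** 0.5


if __name__ == "__main__":
    # Nominal power 0.8 estimated from 1000 simulated trials.
    print(round(power_ci_half_width(0.8, 1000), 4))
```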

4.1 Two-level model

4.1.1 Equal group sizes

We first determined J, the number of groups in the experimental arm in equation (6), for 80% statistical power with other given parameters including group size K, the number of subjects per group. Accordingly, the sample size, or the number of subjects, for the control arm is determined as N_C=JK. We then computed empirical power, denoted by φ̃⁽²⁾, by fitting model (1) based on 1000 simulations for each combination of the specified parameters in Table 1. The results in Table 1 show that the theoretical power φ⁽²⁾ (5) and the simulation-based empirical power φ̃⁽²⁾ are very close, with mean(φ⁽²⁾) − mean(φ̃⁽²⁾)=0.014 (or 1.8% bias) and max|φ⁽²⁾ − φ̃⁽²⁾|=0.034, which is tolerable compared to the 0.025 margin of the 95% CI. This finding supports the validity of the power function φ⁽²⁾ (5) and the sample size formula for J in equation (6).
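A simplified Monte Carlo check of this kind can be sketched in Python. This is not the SAS PROC MIXED analysis with Kenward-Roger degrees of freedom used in the paper: the sketch assumes known variance components, sets β₀ = 0 and σ² = 1, and applies the statistic D⁽²⁾ with a standard normal critical value. The function name and the seed are illustrative.

```python
import random
from statistics import NormalDist


def simulate_power_two_level(delta_std: float, rho: float, j: int, k: int,
                             n_control: int, n_sims: int = 1000,
                             alpha: float = 0.05, seed: int = 0) -> float:
    """Empirical power of D^(2) under model (1) with known variances.

    Assumes sigma^2 = 1, so sigma_2^2 = rho and sigma_e^2 = 1 - rho.
    """
    rng = random.Random(seed)
    sd_u = rho ** 0.5            # SD of the group random effect u_j
    sd_e = (1.0 - rho) ** 0.5    # SD of the residual e_jk
    n_e = j * k
    # Standard error of delta_hat from equation (3) with equal group sizes.
    se = ((1.0 - rho) * (1.0 / n_e + 1.0 / n_control)
          + rho * (j * k * k) / n_e ** 2) ** 0.5
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(n_sims):
        total_e = 0.0
        for _ in range(j):
            u = rng.gauss(0.0, sd_u)  # shared by all subjects in the group
            total_e += sum(delta_std + u + rng.gauss(0.0, sd_e)
                           for _ in range(k))
        total_c = sum(rng.gauss(0.0, sd_e) for _ in range(n_control))
        d = (total_e / n_e - total_c / n_control) / se
        if abs(d) > crit:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    # Design of the first row of Table 1: J = 18, K = 10, N_C = 180.
    print(simulate_power_two_level(0.4, 0.2, j=18, k=10, n_control=180))
```

Because the variances are treated as known here, the result should track the theoretical power more closely than an analysis that estimates the variance components, especially when J is small.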

Table 1.

Comparison of theoretical power and simulation-based empirical power when group sizes are equal in a two-level model: Determinations of J with given K.

Δ₍₂₎	ρ	J (experimental arm, N_E = JK)	K (experimental arm)	J′ = JK (control arm, N_C = J′)	N_E + N_C	φ⁽²⁾	φ̃⁽²⁾
0.4	0.2	18	10	180	360	0.807	0.802
	0.4	26	10	260	520	0.807	0.805
	0.6	34	10	340	680	0.807	0.796
	0.2	26	5	130	260	0.807	0.807
	0.4	32	5	160	320	0.807	0.813
	0.6	38	5	190	380	0.807	0.818
0.5	0.2	12	10	120	240	0.823	0.798
	0.4	17	10	170	340	0.816	0.793
	0.6	22	10	220	440	0.812	0.794
	0.2	17	5	85	170	0.816	0.810
	0.4	21	5	105	210	0.817	0.796
	0.6	24	5	120	240	0.802	0.790
0.6	0.2	8	10	80	160	0.807	0.786
	0.4	12	10	120	240	0.822	0.792
	0.6	15	10	150	300	0.805	0.786
	0.2	12	5	60	120	0.822	0.800
	0.4	14	5	70	140	0.801	0.780
	0.6	17	5	85	170	0.810	0.776
Mean						0.811	0.797


Note: Δ₍₂₎ is a standardized effect size for two-level models; ρ (2) is the intra-class correlation coefficient of outcome Y within groups in the experimental arm; J is the number of groups in the experimental arm determined based on equation (6); K is the given number of subjects per group; φ⁽²⁾ is the theoretical power based on equation (5); and φ̃⁽²⁾ is the empirical power estimated based on 1000 simulations for each combination of design parameters.

Second, we determined K, the number of subjects per group in the experimental arm in equation (7), for 80% statistical power with other given parameters including the number of groups J. In this case, we considered small J to evaluate the power function under this condition. The other parameters are specified in Table 2; in particular, the correlation ρ (2) had to be extremely small to ensure a positive K in equation (7) as noted earlier. The results in Table 2 show that φ⁽²⁾ and φ̃⁽²⁾ are less close than in Table 1, with mean(φ⁽²⁾) − mean(φ̃⁽²⁾) = 0.043 (or 5.4% bias) and max|φ⁽²⁾ − φ̃⁽²⁾|=0.075, which is large compared to the 0.025 margin of the 95% CI. Furthermore, the theoretical power φ⁽²⁾ overestimates the empirical power φ̃⁽²⁾ for all combinations, especially for very large K compared to J. This finding cautions against the use of the power function (5) for determining K in equation (7) with small J and very small ρ.

Table 2.

Comparison of theoretical power and simulation-based empirical power when group sizes are equal in a two-level model: Determinations of K with given J.

Δ₍₂₎	ρ	J (experimental arm, N_E = JK)	K (experimental arm)	J′ = JK (control arm, N_C = J′)	N_E + N_C	φ⁽²⁾	φ̃⁽²⁾
0.4	0.025	5	26	130	260	0.807	0.776
	0.050	5	37	185	370	0.802	0.749
	0.075	5	69	345	690	0.800	0.725
0.5	0.025	5	15	75	150	0.811	0.790
	0.050	5	18	90	180	0.809	0.767
	0.075	5	22	110	220	0.800	0.742
0.6	0.025	5	10	50	100	0.816	0.767
	0.050	5	11	55	110	0.811	0.786
	0.075	5	12	60	120	0.800	0.764
Mean						0.806	0.763


Note: Δ₍₂₎ is a standardized effect size for two-level models; ρ (2) is the intra-class correlation coefficient of outcome Y within groups in the experimental arm; J is the given number of groups in the experimental arm; K is the number of subjects per group determined based on equation (7); φ⁽²⁾ is the theoretical power based on equation (5); and φ̃⁽²⁾ is the empirical power estimated based on 1000 simulations for each combination of design parameters.

4.1.2 Unequal group sizes

Although the power function (4) should be applied to the cases of unequal group sizes, sample size determinations require iterative solutions as mentioned earlier. Therefore, we assessed approximate applicability of the sample size J (6) for equal unit sizes to the case of unequal group sizes. To this end, we determined J for a given equal group size K and then considered a uniform random variable U(a, b) with expectation K to randomly draw varying group sizes K_j where the integer values of a and b were determined as follows: a=K−floor(K/2) and b=K+floor(K/2) so that a>0 and E{U(a, b)}=(a+b)/2=K, where the function floor(x) returns the greatest integer smaller than or equal to x. For the control arm, however, we fixed sample size for practical reasons at N_C=JK which is equal to average sample size for the experimental arm.
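The rule for drawing unequal group sizes can be sketched as follows. The sketch assumes that U(a, b) is a discrete uniform distribution on the integers a, …, b, which is one reading of the description above; the function names are hypothetical.

```python
import random
from typing import List, Tuple


def uniform_bounds(mean_size: int) -> Tuple[int, int]:
    """Integer bounds a = K - floor(K/2) and b = K + floor(K/2)."""
    half = mean_size // 2
    return mean_size - half, mean_size + half


def draw_group_sizes(mean_size: int, n_groups: int,
                     rng: random.Random) -> List[int]:
    """Draw K_j for each experimental-arm group from U(a, b), inclusive."""
    a, b = uniform_bounds(mean_size)
    return [rng.randint(a, b) for _ in range(n_groups)]


if __name__ == "__main__":
    rng = random.Random(2014)   # arbitrary seed for reproducibility
    print(uniform_bounds(10))   # bounds for K = 10
    print(uniform_bounds(5))    # bounds for K = 5
    print(draw_group_sizes(10, 18, rng))
```

The bounds for K = 10 and K = 5 correspond to the U(5,15) and U(3,7) entries in Table 3.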

The theoretical power φ⁽²⁾ was based on the mean group size for the experimental arm and the fixed sample size for the control arm; however, the empirical power φ̃⁽²⁾ was based on the randomly drawn unequal group sizes in the experimental arm. The results in Table 3 again show that the theoretical power and the simulation-based empirical power are reasonably close, with mean(φ⁽²⁾) − mean(φ̃⁽²⁾)=0.026 (or 3.3% bias) and max|φ⁽²⁾ − φ̃⁽²⁾|=0.058, which is somewhat large compared to the 0.025 margin of the 95% CI. Therefore, when group sizes in the experimental arm need to vary, the mean group size can be used as K to determine J in equation (6). Sample size for the control arm can be determined accordingly as JK.

Table 3.

Comparison of theoretical power and simulation-based empirical power when group sizes vary in a two-level model with varying sizes of K.

Δ₍₂₎	ρ	J (experimental arm, N_E = J × mean(K_j))	K_j (experimental arm)	J′ (control arm, N_C = J′)	N_E + N_C	φ⁽²⁾	φ̃⁽²⁾
0.4	0.2	18	U(5,15)	180	360	0.807	0.780
	0.4	26	U(5,15)	260	520	0.807	0.775
	0.6	34	U(5,15)	340	680	0.807	0.786
	0.2	26	U(3,7)	130	260	0.807	0.808
	0.4	32	U(3,7)	160	320	0.807	0.782
	0.6	38	U(3,7)	190	380	0.807	0.775
0.5	0.2	12	U(5,15)	120	240	0.823	0.809
	0.4	17	U(5,15)	170	340	0.816	0.809
	0.6	22	U(5,15)	220	440	0.812	0.789
	0.2	17	U(3,7)	85	170	0.816	0.825
	0.4	21	U(3,7)	105	210	0.817	0.798
	0.6	24	U(3,7)	120	240	0.802	0.779
0.6	0.2	8	U(5,15)	80	160	0.807	0.773
	0.4	12	U(5,15)	120	240	0.822	0.763
	0.6	15	U(5,15)	150	300	0.805	0.760
	0.2	12	U(3,7)	60	120	0.822	0.803
	0.4	14	U(3,7)	70	140	0.801	0.767
	0.6	17	U(3,7)	85	170	0.810	0.752
Mean						0.811	0.785


Note: U(a, b) denotes a uniform distribution with minimum a and maximum b; Δ₍₂₎ is a standardized effect size for two-level models; ρ (2) is the intra-class correlation coefficient of outcome Y within groups in the experimental arm; J is the number of groups in the experimental arm determined based on equation (6) for given mean(K_j); φ⁽²⁾ is the theoretical power based on equation (5) with K replaced by mean(K_j); and φ̃⁽²⁾ is the empirical power estimated based on 1000 simulations for each combination of design parameters.

4.2 Three-level model

4.2.1 Equal unit sizes

We first determined I, the number of centers in the experimental arm in equation (17), for 80% statistical power with other given parameters including the number of groups J per center and the number of subjects K per group. The sample size, or the number of subjects, per center for the control arm is determined as JK so that N_E=N_C=IJK. We then computed empirical power, denoted by φ̃⁽³⁾, by fitting model (9) based on 1000 simulations for each combination of the specified parameters in Table 4. The results in Table 4 show that the theoretical power φ⁽³⁾ (16) and the simulation-based empirical power φ̃⁽³⁾ are very close, with mean(φ⁽³⁾) − mean(φ̃⁽³⁾)=0.006 (0.8% bias) and max|φ⁽³⁾ − φ̃⁽³⁾|=0.027, which is excellent compared to the 0.025 margin of the 95% CI. This finding supports the validity of the power function φ⁽³⁾ (16) and the sample size formula for I in equation (17).

Table 4.

Comparison of theoretical power and simulation-based empirical power when both center and group sizes are equal in a three-level model: Determinations of I with given J and K.

Δ₍₃₎	ρ₂	ρ₁	I (experimental arm, N_E = IJK)	J (experimental arm)	K (experimental arm)	I′ = I (control arm, N_C = I′JK)	JK (control arm)	N_E + N_C	φ⁽³⁾	φ̃⁽³⁾
0.4	0.1	0.4	14	5	10	14	50	1400	0.802	0.809
		0.6	16	5	10	16	50	1600	0.812	0.830
	0.2	0.4	23	5	10	23	50	2300	0.804	0.787
		0.6	25	5	10	25	50	2500	0.811	0.810
	0.1	0.4	13	10	5	13	50	1300	0.816	0.829
		0.6	14	10	5	14	50	1400	0.827	0.819
	0.2	0.4	22	10	5	22	50	2200	0.804	0.807
		0.6	23	10	5	23	50	2300	0.811	0.802
0.5	0.1	0.4	9	5	10	9	50	900	0.804	0.784
		0.6	10	5	10	10	50	1000	0.803	0.776
	0.2	0.4	15	5	10	15	50	1500	0.811	0.805
		0.6	16	5	10	16	50	1600	0.811	0.807
	0.1	0.4	8	10	5	8	50	800	0.801	0.781
		0.6	9	10	5	9	50	900	0.829	0.839
	0.2	0.4	14	10	5	14	50	1400	0.802	0.795
		0.6	15	10	5	15	50	1500	0.818	0.826
0.6	0.1	0.4	7	5	10	7	50	700	0.846	0.834
		0.6	7	5	10	7	50	700	0.806	0.807
	0.2	0.4	11	5	10	11	50	1100	0.832	0.824
		0.6	11	5	10	11	50	1100	0.807	0.796
	0.1	0.4	6	10	5	6	50	600	0.831	0.811
		0.6	6	10	5	6	50	600	0.813	0.805
	0.2	0.4	10	10	5	10	50	1000	0.813	0.795
		0.6	10	10	5	10	50	1000	0.802	0.789
Mean									0.813	0.807


Note: Δ₍₃₎ is a standardized effect size for three-level models; ρ₁ (11) is the correlation among the subject-level observations of outcome Y in the experimental arm; ρ₂ (10) is the correlation among the group-level observations of outcome Y in the experimental arm; I is the number of centers in the experimental arm determined based on equation (17); J is the given number of groups per center; K is the given number of subjects per group; φ⁽³⁾ is the theoretical power based on equation (16); and φ̃⁽³⁾ is the empirical power estimated based on 1000 simulations for each combination of design parameters.

Second, we determined K, the number of subjects per group in the experimental arm in equation (19), for 80% statistical power with other given parameters including I and J. In this case, again, we considered small I and J to evaluate the power function under this condition. The other parameters are specified in Table 5; in particular, the correlation ρ₂ (10) had to be extremely small to ensure a positive K in equation (19) as noted earlier. The results in Table 5 show that φ⁽³⁾ and φ̃⁽³⁾ are close, with mean(φ⁽³⁾)−mean(φ̃⁽³⁾)=0.028 (or 3.5% bias) and max|φ⁽³⁾−φ̃⁽³⁾|=0.050, which is somewhat large compared to the 0.025 margin of the 95% CI. Compared to the case of small J under two-level models, however, the differences between φ⁽³⁾ and φ̃⁽³⁾ for all combinations are smaller and tolerable despite very small ρ₂. This finding supports the use of the power function (16) even for small I and J with very small ρ₂ in determination of K in equation (19).

Table 5.

Comparison of theoretical power and simulation-based empirical power when both center and group sizes are equal in a three-level model: Determinations of K with given I and J.

Δ₍₃₎	ρ₂	ρ₁	I (experimental arm, N_E = IJK)	J (experimental arm)	K (experimental arm)	I′ = I (control arm, N_C = I′JK)	JK (control arm)	N_E + N_C	φ⁽³⁾	φ̃⁽³⁾
0.4	0.01	0.1	5	5	6	5	30	300	0.815	0.787
		0.2	5	5	8	5	40	400	0.815	0.784
	0.02	0.1	5	5	8	5	40	400	0.804	0.800
		0.2	5	5	13	5	65	650	0.805	0.779
0.5	0.01	0.1	5	5	3	5	15	150	0.803	0.793
		0.2	5	5	4	5	20	200	0.853	0.815
	0.02	0.1	5	5	4	5	20	200	0.833	0.801
		0.2	5	5	4	5	20	200	0.808	0.787
0.6	0.01	0.1	5	5	2	5	10	100	0.820	0.777
		0.2	5	5	2	5	10	100	0.820	0.770
	0.02	0.1	5	5	3	5	15	150	0.892	0.855
		0.2	5	5	3	5	15	150	0.881	0.864
Mean									0.829	0.801


Note: Δ₍₃₎ is a standardized effect size for three-level models; ρ₁ (11) is the correlation among the subject-level observations of outcome Y in the experimental arm; ρ₂ (10) is the correlation among the group-level observations of outcome Y in the experimental arm; I is the given number of centers in the experimental arm; J is the given number of groups per center; K is the number of subjects per group determined based on equation (19); φ⁽³⁾ is the theoretical power based on equation (16); and φ̃⁽³⁾ is the empirical power estimated based on 1000 simulations for each combination of design parameters.

4.2.2 Unequal unit sizes

Again, although the power function (15) should be applied to the cases of unequal unit sizes, sample size determinations then require iterative solutions, as mentioned earlier. Therefore, we again assessed the approximate applicability of the sample size I (17), derived for equal unit sizes, to the case of unequal unit sizes. To this end, we determined I for an equal number of groups J and an equal group size K, and then randomly drew a varying number of groups J_i from a uniform distribution U(a, b) with a=J−floor(J/2) and b=J+floor(J/2) so that a>0 and E{U(a, b)}=J. We further varied the group size K_ij by drawing from a uniform distribution U(a, b) with a=K−floor(K/2) and b=K+floor(K/2) so that a>0 and E{U(a, b)}=K. Likewise, for the control arm, we varied the center size, or number of subjects per center, K_i by randomly drawing from a uniform distribution U(a, b) with a=JK−floor(JK/2) and b=JK+floor(JK/2) so that a>0 and E{U(a, b)}=JK.
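The sampling scheme described above can be sketched as follows. This is an illustration only, not the simulation code used for the study: it assumes that U(a, b) is a discrete uniform distribution on the integers from a to b inclusive, and the names `uniform_bounds` and `draw_unit_sizes` are hypothetical.

```python
import random


def uniform_bounds(mean_size):
    """Return (a, b) with a = m - floor(m/2) and b = m + floor(m/2)."""
    half = mean_size // 2
    return mean_size - half, mean_size + half


def draw_unit_sizes(I, J, K, rng):
    """Draw unequal unit sizes for both arms.

    Assumption: sizes are integers drawn uniformly from a..b inclusive,
    so the expected size equals the midpoint (a + b) / 2.
    """
    a_j, b_j = uniform_bounds(J)      # number of groups per center, J_i
    a_k, b_k = uniform_bounds(K)      # subjects per group, K_ij
    a_c, b_c = uniform_bounds(J * K)  # subjects per control center, K_i

    experimental = []
    for _ in range(I):
        n_groups = rng.randint(a_j, b_j)
        experimental.append([rng.randint(a_k, b_k) for _ in range(n_groups)])

    control = [rng.randint(a_c, b_c) for _ in range(I)]
    return experimental, control


rng = random.Random(2014)  # fixed seed so the draw is reproducible
experimental, control = draw_unit_sizes(I=14, J=5, K=10, rng=rng)
print(uniform_bounds(5), uniform_bounds(10), uniform_bounds(50))
# prints (3, 7) (5, 15) (25, 75)
```

With J=5 and K=10 the bounds reproduce the U(3,7), U(5,15), and U(25,75) ranges that appear in Table 6.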

The theoretical power φ⁽³⁾ was based on the mean unit sizes for the experimental arm and the mean center size for the control arm; however, the empirical power φ̃⁽³⁾ was based on the randomly drawn unequal unit sizes. The results in Table 6 show that the theoretical power and the simulation-based empirical power are very close, with mean(φ⁽³⁾)−mean(φ̃⁽³⁾)=0.014 (1.8% bias) and max|φ⁽³⁾−φ̃⁽³⁾|=0.030, which is acceptable compared to the 0.025 margin of the 95% CI. Therefore, when unit sizes need to vary, the mean number of groups per center and the mean group size can be used as J and K, respectively, to determine I in equation (17). Likewise, the center sizes for the control arm can vary with mean JK.
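The two summary measures used in this comparison can be recomputed directly from the power columns of Table 6. The sketch below is illustrative; the function name `summarize_power` is hypothetical, and the two lists are the φ⁽³⁾ and φ̃⁽³⁾ columns of Table 6 in row order.

```python
# Theoretical power, phi^(3), from Table 6 (24 design combinations)
theoretical = [
    0.802, 0.812, 0.804, 0.811, 0.816, 0.827, 0.804, 0.811,
    0.804, 0.803, 0.811, 0.811, 0.801, 0.829, 0.802, 0.818,
    0.846, 0.806, 0.832, 0.807, 0.831, 0.813, 0.813, 0.802,
]
# Empirical power, phi-tilde^(3), from Table 6 in the same row order
empirical = [
    0.791, 0.821, 0.786, 0.794, 0.795, 0.802, 0.803, 0.794,
    0.781, 0.802, 0.802, 0.801, 0.783, 0.811, 0.790, 0.812,
    0.831, 0.814, 0.808, 0.783, 0.828, 0.787, 0.783, 0.786,
]


def summarize_power(theoretical, empirical):
    """Return (difference of means, maximum absolute row-wise difference)."""
    n = len(theoretical)
    mean_diff = sum(theoretical) / n - sum(empirical) / n
    max_abs_diff = max(abs(t - e) for t, e in zip(theoretical, empirical))
    return mean_diff, max_abs_diff


mean_diff, max_abs_diff = summarize_power(theoretical, empirical)
print(f"{mean_diff:.3f} {max_abs_diff:.3f}")  # prints 0.014 0.030
```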

Table 6.

Comparison of theoretical power and simulation-based empirical power when both center and group sizes vary in a three-level model: Determinations of I with varying sizes of J and K.

Δ₍₃₎	ρ₂	ρ₁	Experimental arm: N_E = I × mean(J_i) × mean(K_ij)			Control arm: N_C = I′ × mean(K_i)		N_E + N_C	φ⁽³⁾	φ̃⁽³⁾
Δ₍₃₎	ρ₂	ρ₁	I	J_i	K_ij	I′ = I	K_i	N_E + N_C	φ⁽³⁾	φ̃⁽³⁾
0.4	0.1	0.4	14	U(3,7)	U(5,15)	14	U(25,75)	1400	0.802	0.791
		0.6	16	U(3,7)	U(5,15)	16	U(25,75)	1600	0.812	0.821
	0.2	0.4	23	U(3,7)	U(5,15)	23	U(25,75)	2300	0.804	0.786
		0.6	25	U(3,7)	U(5,15)	25	U(25,75)	2500	0.811	0.794
	0.1	0.4	13	U(5,15)	U(3,7)	13	U(25,75)	1300	0.816	0.795
		0.6	14	U(5,15)	U(3,7)	14	U(25,75)	1400	0.827	0.802
	0.2	0.4	22	U(5,15)	U(3,7)	22	U(25,75)	2200	0.804	0.803
		0.6	23	U(5,15)	U(3,7)	23	U(25,75)	2300	0.811	0.794
0.5	0.1	0.4	9	U(3,7)	U(5,15)	9	U(25,75)	900	0.804	0.781
		0.6	10	U(3,7)	U(5,15)	10	U(25,75)	1000	0.803	0.802
	0.2	0.4	15	U(3,7)	U(5,15)	15	U(25,75)	1500	0.811	0.802
		0.6	16	U(3,7)	U(5,15)	16	U(25,75)	1600	0.811	0.801
	0.1	0.4	8	U(5,15)	U(3,7)	8	U(25,75)	800	0.801	0.783
		0.6	9	U(5,15)	U(3,7)	9	U(25,75)	900	0.829	0.811
	0.2	0.4	14	U(5,15)	U(3,7)	14	U(25,75)	1400	0.802	0.790
		0.6	15	U(5,15)	U(3,7)	15	U(25,75)	1500	0.818	0.812
0.6	0.1	0.4	7	U(3,7)	U(5,15)	7	U(25,75)	700	0.846	0.831
		0.6	7	U(3,7)	U(5,15)	7	U(25,75)	700	0.806	0.814
	0.2	0.4	11	U(3,7)	U(5,15)	11	U(25,75)	1100	0.832	0.808
		0.6	11	U(3,7)	U(5,15)	11	U(25,75)	1100	0.807	0.783
	0.1	0.4	6	U(5,15)	U(3,7)	6	U(25,75)	600	0.831	0.828
		0.6	6	U(5,15)	U(3,7)	6	U(25,75)	600	0.813	0.787
	0.2	0.4	10	U(5,15)	U(3,7)	10	U(25,75)	1000	0.813	0.783
		0.6	10	U(5,15)	U(3,7)	10	U(25,75)	1000	0.802	0.786
Mean									0.813	0.800

Note: U(a, b) denotes a uniform distribution with minimum a and maximum b; Δ₍₃₎ is a standardized effect size for three-level models; ρ₁ (11) is the correlation among the subject-level observations of outcome Y in the experimental arm; ρ₂ (10) is the correlation among the group-level observations of outcome Y in the experimental arm; I is the number of centers in the experimental arm determined based on equation (17) for the given mean of J_i and the given mean of K_ij; φ⁽³⁾ is the theoretical power based on equation (16) with J and K replaced by mean(J_i) and mean(K_ij), respectively; and φ̃⁽³⁾ is the empirical power estimated based on 1000 simulations for each combination of design parameters.

5 Discussion

We derived and validated sample size formulae for detecting main effects in trials with different levels of data hierarchy between arms and with varying sizes of units at each nested level. In fact, it can be shown that the sample size J (6) is smaller than that for a trial with two levels of data hierarchy for both arms.¹⁹ Likewise, the sample size I (17) is smaller than that for a trial with three levels of data hierarchy for both arms.^20,21 Across Tables 1 to 6, the mean theoretical power was greater than the mean empirical power (see the bottom rows). This discrepancy arose because the theoretical power was derived under known variance components, using standard normal distributions, whereas the empirical power was estimated under unknown variance components, using t-distributions. The small size of the discrepancies, however, indicates that the theoretical power can be used for models in which unknown variance components have to be replaced by estimates for fitting, regardless of whether the sizes of the nesting units are equal or varying. Nevertheless, caution should be exercised when determining K with small J and very small ρ under two-level partially clustered designs.

These derivations have great relevance to clinical research, since the Affordable Care Act and various health care system stakeholders encourage the development and evaluation of innovative delivery models such as group models of care.²² In addition, as new study designs emerge to expand the field of patient-centered outcomes research, it is imperative to evaluate the implications of design innovations and to ensure that design methodology remains rigorous. Studies in which there are systematic, planned differences between comparison groups are likely to become more frequent, and it is crucially important to understand how such differences affect the interpretation of results. Therefore, rigorous statistical methods will be needed in order to establish that group-based models of care improve clinical outcomes compared to individual-based models of care. To this end, two potential extensions of the sample size determination approaches proposed here would be valuable for future studies: (1) extension to designs that implement both group-based experimental and individual-based control arms within centers; and (2) extension to designs with binary outcomes and other types of outcomes.

Acknowledgments

We are grateful to two anonymous reviewers for their corrections and suggestions for improving the contents of this paper.

Funding

This study was supported in part by the following NIH grants: R01DA034086, R25DA023021, K23MH102129, P30AI051519, and UL1TR001073.

Footnotes

Conflict of interest

None declared.

References

1. Jaber R, Braksmajer A, Trilling JS. Group visits: A qualitative review of current research. J Am Board Family Med. 2006;19:276–290. doi: 10.3122/jabfm.19.3.276.
2. Drum D, Becker M, Hess E. Expanding the application of group interventions: emergence of groups in health care settings. J Specialist Group Work. 2011;36:247–263.
3. McCarthy C, Hart S. Designing groups to meet evolving challenges in health care settings. J Specialist Group Work. 2011;36:352–367.
4. Wagner EH, Grothaus LC, Sandhu N, et al. Chronic care clinics for diabetes in primary care – A system-wide randomized trial. Diab Care. 2001;24:695–700. doi: 10.2337/diacare.24.4.695.
5. Sadur CN, Moline N, Costa M, et al. Diabetes management in a health maintenance organization – Efficacy of care management using cluster visits. Diabetes Care. 1999;22:2011–2017. doi: 10.2337/diacare.22.12.2011.
6. Burke RE, O’Grady ET. Group visits hold great potential for improving diabetes care and outcomes, but best practices must be developed. Health Affairs. 2012;31:103–109. doi: 10.1377/hlthaff.2011.0913.
7. Stein MR, Soloway IJ, Jefferson KS, et al. Concurrent group treatment for hepatitis C: Implementation and outcomes in a methadone maintenance treatment program. J Substance Abuse Treatment. 2012;43:424–432. doi: 10.1016/j.jsat.2012.08.007.
8. Moadel AB, Bernstein SL, Mermelstein RJ, et al. A randomized controlled trial of a tailored group smoking cessation intervention for HIV-infected smokers. JAIDS. 2012;61:208–215. doi: 10.1097/QAI.0b013e3182645679.
9. Ickovics JR, Kershaw TS, Westdahl C, et al. Group prenatal care and perinatal outcomes: A Randomized controlled trial. Obstet Gynecol. 2007;110:937–937. doi: 10.1097/01.AOG.0000275284.24298.23.
10. Scott JC, Conner DA, Venohr I, et al. Effectiveness of a group outpatient visit model for chronically ill older health maintenance organization members: A 2-year randomized trial of the cooperative health care clinic. J Am Geriatrics Soc. 2004;52:1463–1470. doi: 10.1111/j.1532-5415.2004.52408.x.
11. Freitag CM, Cholemkery H, Elsuni L, et al. The group-based social skills training SOSTA-FRA in children and adolescents with high functioning autism spectrum disorder – study protocol of the randomised, multi-centre controlled SOSTA – net trial. Trials. 2013;14:12. doi: 10.1186/1745-6215-14-6.
12. Barlow J, Smailagic N, Huband N, et al. Group-based parent training programmes for improving parental psychosocial health. Cochrane Database Systematic Rev. 2012:200. doi: 10.1002/14651858.CD002020.pub3.
13. Roberts C, Roberts SA. Design and analysis of clinical trials with clustering effects due to treatment. Clin Trials. 2005;2:152–162. doi: 10.1191/1740774505cn076oa.
14. Bauer DJ, Sterba SK, Hallfors DD. Evaluating group-based interventions when control participants are ungrouped. Multivariate Behavior Res. 2008;43:210–236. doi: 10.1080/00273170802034810.
15. Baldwin SA, Bauer DJ, Stice E, et al. Evaluating models for partially clustered designs. Psychol Meth. 2011;16:149–165. doi: 10.1037/a0023464.
16. Moerbeek M, Wong WK. Sample size formulae for trials comparing group and individual treatments in a multilevel model. Stat Med. 2008;27:2850–2864. doi: 10.1002/sim.3115.
17. Cohen J. Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.
18. Kenward MG, Roger JH. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics. 1997;53:983–997.
19. Diggle PJ, Heagerty P, Liang K-Y, et al. Analysis of longitudinal data. 2nd ed. New York: Oxford University Press; 2002.
20. Heo M, Leon AC. Statistical power and sample size requirements for three level hierarchical cluster randomized trials. Biometrics. 2008;64:1256–1262. doi: 10.1111/j.1541-0420.2008.00993.x.
21. Teerenstra S, Moerbeek M, van Achterberg T, et al. Sample size calculations for 3-level cluster randomized trials. Clinical Trials. 2008;5:486–495. doi: 10.1177/1740774508096476.
22. Davis K, Abrams M, Stremikis K. How the affordable care act will strengthen the nation’s primary care foundation. J Gen Intern Med. 2011;26:1201–1203. doi: 10.1007/s11606-011-1720-y.
