Author manuscript; available in PMC: 2022 Feb 16.
Published in final edited form as: Clin Trials. 2021 Nov 29;19(1):86–96. doi: 10.1177/17407745211046672

Table 1.

Potential sources of clustering and consequences of ignoring the intracluster correlation in the sample size calculation and analysis

| Type of clustering | Explanation | Implications for calculated sample size when correlation is ignored [a] | Implications for analysis when correlation is ignored [a] |
| --- | --- | --- | --- |
| Cluster randomized trial | Observations from multiple individuals belonging to the same cluster are usually positively correlated | Too small | Increased risk of Type I error |
| Individually randomized group treatment trial | Multiple observations from individuals receiving treatment in the same group or by the same interventionist are usually positively correlated | Too small | Increased risk of Type I error |
| Individually randomized parallel arm trial with repeated measures on the same individual after intervention (treatment is a between-subject effect) | Multiple repeated measures on the same individual are usually positively correlated | Too small [b] | Increased risk of Type I error [b] |
| Individually randomized cross-over trial (treatment is a within-subject effect) | Multiple repeated measures on the same individual are usually positively correlated | Too large [b] | Increased risk of Type II error [b] |
| Individually randomized parallel arm trial with repeated measures on the same individual before and after intervention | Multiple repeated measures on the same individual are usually positively correlated | Either too large or too small [c] | Increased risk of Type I or Type II error [c] |
| Dyadic outcome with both members of the pair allocated to the same intervention (treatment is a between-dyads effect) | Measurements on two individuals in a dyad (e.g., patient-caregiver) may be positively or negatively correlated | Too small if correlation is positive [d]; too large if correlation is negative [d] | Increased risk of Type I error if correlation is positive [d]; increased risk of Type II error if correlation is negative [d] |
| Multivariate or co-primary endpoints when the trial is designed to evaluate a joint effect on all the endpoints | Multiple components of the multivariate outcome are usually positively correlated. Power decreases as the number of endpoints being evaluated increases. To maintain nominal power, the sample size should be increased. Accounting for the correlation can lessen the increase in the sample size. | Too large | Increased risk of Type II error |
[a] Note: we consider here a superiority trial design with a continuous or binary endpoint and a single source of clustering; we further consider an analysis involving all measurements but ignoring the intracluster correlations.

[b] Assumes all measurements are analyzed, ignoring the correlation. Failing to utilize available repeated measures (i.e., basing the sample size calculation or analysis on a single measurement per subject) has the opposite effect: it means that the sample size may be larger than required and the analysis may be statistically inefficient.

[c] Depending on the number of repeated measurements and the strength of the within-subject correlation.

[d] Assumes all measurements are analyzed, ignoring the correlation. Failing to utilize the available pairwise measurements (i.e., basing the sample size calculation or analysis on a single member of the dyad when the observations are positively correlated) has the opposite effect: the sample size may be larger than required and the analysis may be statistically inefficient.
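
The direction of the biases listed in Table 1 can be illustrated with a minimal sketch. It relies on two standard textbook approximations that are assumed here rather than taken from the table: the design effect 1 + (m - 1)ρ for a treatment effect estimated between clusters of equal size m, and the variance factor 1 - ρ for a within-subject contrast in a two-period cross-over, both under an exchangeable correlation ρ. The function names are hypothetical, and the sketch does not cover the before-and-after or multivariate rows.

```python
import math


def between_cluster_factor(m: int, rho: float) -> float:
    """Design effect 1 + (m - 1) * rho for a between-cluster treatment effect.

    Assumes equal cluster sizes m and an exchangeable correlation rho.
    """
    if m < 1:
        raise ValueError("cluster size m must be at least 1")
    if m > 1 and not (-1.0 / (m - 1) <= rho <= 1.0):
        raise ValueError("rho is outside the valid range for this cluster size")
    return 1.0 + (m - 1) * rho


def within_subject_factor(rho: float) -> float:
    """Variance factor 1 - rho for a within-subject (two-period cross-over) contrast."""
    if not (-1.0 <= rho <= 1.0):
        raise ValueError("rho must lie between -1 and 1")
    return 1.0 - rho


def adjusted_size(n_ignoring_correlation: float, factor: float) -> int:
    """Scale a sample size computed under independence by a variance factor."""
    return math.ceil(n_ignoring_correlation * factor)


if __name__ == "__main__":
    # Cluster randomized trial: factor > 1, so the naive size is too small.
    print(adjusted_size(128, between_cluster_factor(m=20, rho=0.05)))
    # Dyads (m = 2): positive correlation inflates, negative correlation deflates.
    print(adjusted_size(100, between_cluster_factor(m=2, rho=0.5)))
    print(adjusted_size(100, between_cluster_factor(m=2, rho=-0.25)))
    # Cross-over trial: factor < 1, so the naive size is too large.
    print(adjusted_size(64, within_subject_factor(rho=0.5)))
```

A factor above 1 corresponds to the "Too small" entries (the naive calculation understates the required size), and a factor below 1 corresponds to the "Too large" entries.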