Table 1.
Parameter | Description |
---|---|
Population size | • Total population size from which the sample will be drawn and about which researchers will make conclusions. • If the target population is small (less than 10 times the sample size) then a finite population correction may be required. |
Expected prevalence | • Information regarding expected prevalence should be obtained from the literature, from expert knowledge, or by carrying out a pilot study. • When this information is not available, the value that maximizes sample size can be used (usually 50% prevalence). |
Intracluster correlation coefficient (ICC) | • For a clustered study design, the level to which individuals from the same cluster have correlated outcomes, due to (i) similar behaviors and risk factors, (ii) the cluster itself introducing correlations, or (iii) the process of disease transmission introducing correlations. • Leads to diminishing returns when sampling more people from the same cluster, and favors instead larger numbers of clusters. • Reasonable estimates of the ICC can be obtained from the literature or from expert knowledge. Pilot studies will often be underpowered to estimate the ICC. |
Design effect ($D_{\text{eff}}$) | • Ratio of the variance of a statistic with a complex sample design to the variance under simple random sampling. Larger values represent less efficient designs, with 1 representing perfect efficiency. • A large design effect needs to be compensated by an increase in sample size. • In general, a design effect of 1.5 or 2 is considered reasonable. However, much larger values are plausible in practice, and values should be tailored to the study where possible. • For clustered surveys, the design effect can be calculated from the ICC through the formula $D_{\text{eff}} = 1 + (m - 1) \times \text{ICC}$, where $m$ is the per-cluster sample size. |
Significance level (Alpha) | • A predetermined threshold that determines the strength of evidence required to reject the null hypothesis. • Represents the maximum acceptable probability of making a type I error (false positive). • A value of $\alpha = 0.05$ is often used. Smaller values need to be compensated by larger sample sizes. |
Statistical power (1-Beta) | • The probability of correctly rejecting the null hypothesis when it is indeed false. • A value between 80% and 90% is usually used. • The greater the power, the larger the sample size required. |
Margin of error (MOE) | • A measure of the amount of sampling error we expect in the results of our survey. • Can be used to select sample sizes in cases where no hypothesis test is being performed. • The smaller the MOE, the larger the sample size required. |
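
As an illustration of how several of the parameters in Table 1 interact, the following minimal Python sketch computes a margin-of-error based sample size. It is not taken from the table itself: it assumes the standard normal-approximation (Wald) formula for a proportion, the usual clustered-survey design effect $1 + (m - 1) \times \text{ICC}$, and a common form of the finite population correction. All function and parameter names are hypothetical and chosen for illustration only.

```python
import math
from statistics import NormalDist


def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect for a clustered survey: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def finite_population_correction(n: float, population_size: float) -> float:
    """Reduce a simple-random-sampling sample size n for a finite population.

    Assumed form: n / (1 + (n - 1) / N), one common version of the correction.
    """
    return n / (1.0 + (n - 1.0) / population_size)


def sample_size_for_moe(
    prevalence: float,
    moe: float,
    alpha: float = 0.05,
    deff: float = 1.0,
) -> int:
    """Sample size so that a two-sided (1 - alpha) Wald interval has half-width moe.

    Assumption: normal approximation to the binomial, inflated by the
    design effect. Other interval methods give different answers.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Sample size under simple random sampling.
    n_srs = z**2 * prevalence * (1.0 - prevalence) / moe**2
    # Inflate by the design effect and round up to a whole participant.
    return math.ceil(deff * n_srs)


if __name__ == "__main__":
    # Hypothetical inputs: 10 people per cluster, ICC of 0.05.
    deff = design_effect(cluster_size=10, icc=0.05)
    print("design effect:", deff)
    print("n (simple random sampling):", sample_size_for_moe(0.5, 0.05))
    print("n (clustered):", sample_size_for_moe(0.5, 0.05, deff=deff))
    print("n (SRS, population of 2000):",
          math.ceil(finite_population_correction(385, 2000)))
```

Under these assumptions, an expected prevalence of 50%, a margin of error of 5 percentage points and a significance level of 0.05 give a sample size of 385 under simple random sampling. Increasing the design effect, tightening the margin of error or lowering the significance level each raise that figure, in line with the descriptions in the table.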