2014 Jul-Aug;89(4):609–615. doi: 10.1590/abd1806-4841.20143705

CHART 3.

Description of different parameters to be considered in the calculation of sample size for a study aiming at estimating the frequency of health outcomes, behaviors or conditions

Parameter: Type I or Alpha error
Description: It is the probability of rejecting H0 when H0 is true in the target population. Usually fixed at 5%.
Remark: It is the significance threshold against which the p value is compared, usually 5% (p<0.05). For sample size calculation, the confidence level may be adopted (usually 95%), calculated as 1-Alpha. The smaller the Alpha error (the greater the confidence level), the larger the sample size will be.

Parameter: Statistical Power (1-Beta)
Description: It is the ability of the test to detect a difference in the sample when it exists in the target population.
Remark: Calculated as 1-Beta. The greater the power, the larger the required sample size will be. A value between 80% and 90% is usually used.

Parameter: Relationship between non-exposed/exposed groups in the sample
Description: It indicates the ratio of non-exposed to exposed individuals in the sample.
Remark: For observational studies, the data are usually obtained from the scientific literature. In intervention studies, the value 1:1 is frequently adopted, indicating that half of the individuals will receive the intervention and the other half will be the control or comparison group. Some intervention studies may use a larger number of controls than of individuals receiving the intervention. The more distant this ratio is from one, the larger the required sample size will be.

Parameter: Prevalence* of outcome in the non-exposed group** (percentage of positive among the non-exposed)
Description: Proportion of individuals with the disease (outcome) among those non-exposed to the risk factor (or that are part of the control group).
Remark: Data usually obtained from the literature. When this information is not available but there is information on general prevalence/incidence in the population, this value may be used in sample size calculation (values attributed to the control group in intervention studies) or estimated based on the following formula: PONE = pO / (pNE + (pE × PR)), where pO = prevalence of outcome; pNE = proportion of non-exposed; pE = proportion of exposed (pNE + pE = 1); PR = prevalence* ratio (usually a value between 1.5 and 2.0).

Parameter: Expected prevalence* ratio
Description: Relationship between the prevalence* of disease in the exposed (intervention) group and the prevalence* of disease in the non-exposed group, indicating how many times it is expected that the prevalence* will be higher (or lower) in the exposed compared to the non-exposed group.
Remark: It is the value that the investigators intend to find as HA, with the corresponding H0 equal to one (similar prevalence* of the outcome in both exposed and non-exposed groups). For the sample size estimates, the expected outcome prevalence* may be used for the non-exposed group, or the expected difference in the prevalence* between the exposed and the non-exposed groups. Usually, a value between 1.50 and 2.00 is used (exposure as risk factor) or between 0.50 and 0.75 (protective factor). For intervention studies, the clinical relevance of this value should be considered. The closer the prevalence* ratio is to one (the smaller the expected difference between the groups), the larger the required sample size.

Parameter: Type of statistical test
Description: The test may be one-tailed or two-tailed, depending on the type of the HA.
Remark: Two-tailed tests require larger sample sizes than one-tailed tests.
* It may be prevalence, incidence or risk, according to type of study;

** Non-exposed or control group;

H0 - null hypothesis; HA - alternative hypothesis
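
The way the parameters in this chart combine can be illustrated with a short Python sketch. It is not taken from the article: it assumes the widely used normal-approximation formula for comparing two proportions without continuity correction, and the function and argument names are hypothetical. The first function applies the PONE formula from the chart, with the exposed and non-exposed shares entered as proportions that sum to one. The second turns Alpha, power, the non-exposed/exposed ratio, the prevalence in the non-exposed group, the expected prevalence ratio and the type of test into an approximate number of participants per group.

```python
import math
from statistics import NormalDist


def prevalence_non_exposed(p_outcome, prop_exposed, ratio):
    """Chart formula: PONE = pO / (pNE + pE * PR), all values as proportions."""
    prop_non_exposed = 1.0 - prop_exposed  # pNE and pE must sum to one
    return p_outcome / (prop_non_exposed + prop_exposed * ratio)


def sample_size_two_groups(p_non_exposed, ratio, alpha=0.05, power=0.80,
                           non_exposed_per_exposed=1.0, two_tailed=True):
    """Approximate (n_exposed, n_non_exposed) for comparing two proportions.

    Assumption: normal approximation without continuity correction.
    """
    p0 = p_non_exposed
    p1 = ratio * p0  # expected prevalence in the exposed group
    if not (0.0 < p0 < 1.0 and 0.0 < p1 < 1.0) or p1 == p0:
        raise ValueError("prevalences must lie in (0, 1) and differ")
    r = non_exposed_per_exposed
    z = NormalDist()
    # A two-tailed test splits Alpha between both tails
    z_alpha = z.inv_cdf(1.0 - (alpha / 2.0 if two_tailed else alpha))
    z_beta = z.inv_cdf(power)
    # Pooled prevalence under H0, weighted by group sizes
    p_bar = (p1 + r * p0) / (1.0 + r)
    term_h0 = z_alpha * math.sqrt((1.0 + 1.0 / r) * p_bar * (1.0 - p_bar))
    term_ha = z_beta * math.sqrt(p1 * (1.0 - p1) + p0 * (1.0 - p0) / r)
    n_exposed = (term_h0 + term_ha) ** 2 / (p1 - p0) ** 2
    return math.ceil(n_exposed), math.ceil(r * n_exposed)


if __name__ == "__main__":
    # Illustrative values only: 20% overall prevalence, 30% exposed, PR = 2.0
    p0 = prevalence_non_exposed(0.20, 0.30, 2.0)
    print(round(p0, 4))
    print(sample_size_two_groups(0.10, 2.0))
```

Changing one argument at a time reproduces the directions stated in the chart: a smaller Alpha, a higher power, a prevalence ratio closer to one, a two-tailed test, or a non-exposed/exposed ratio further from one each increase the total number of participants. Dedicated sample size software may give slightly different figures, for example when a continuity correction is applied.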