CHART 3.
Description of different parameters to be considered in the calculation of sample size for a study aiming at estimating the frequency of health outcomes, behaviors or conditions
Parameter | Description | Remark |
---|---|---|
Type I or Alpha error | It is the probability of rejecting H0 when H0 is true in the target population. Usually fixed at 5%. For sample size calculation, the confidence level may be adopted (usually 95%), calculated as 1-Alpha. The smaller the Alpha error (greater confidence level), the larger the sample size will be. | It is the significance threshold against which the p value is compared. It is usually 5% (p<0.05). |
Statistical Power (1-Beta) | It is the ability of the test to detect a difference in the sample when it exists in the target population. The greater the power, the larger the required sample size will be. A value between 80% and 90% is usually used. | Calculated as 1-Beta. |
Relationship between non-exposed/exposed groups in the sample | It indicates the ratio between the numbers of non-exposed and exposed individuals in the sample. The more distant this ratio is from one, the larger the required sample size will be. | For observational studies, the data are usually obtained from the scientific literature. In intervention studies, the value 1:1 is frequently adopted, indicating that half of the individuals will receive the intervention and the other half will be the control or comparison group. Some intervention studies may use a larger number of controls than of individuals receiving the intervention. |
Prevalence* of outcome in the non-exposed group** (percentage of positive among the non-exposed) | Proportion of individuals with the disease (outcome) among those non-exposed to the risk factor (or that are part of the control group). | Data usually obtained from the literature. When this information is not available but there is information on general prevalence/incidence in the population, this value may be used in sample size calculation (values attributed to the control group in intervention studies) or estimated based on the following formula: PONE = pO / (pNE + (pE × PR)), where PONE = prevalence* of outcome in the non-exposed group; pO = prevalence of outcome in the population; pNE = proportion of non-exposed; pE = proportion of exposed; PR = prevalence* ratio (usually a value between 1.5 and 2.0). All quantities except PR are expressed as proportions (0 to 1), with pNE + pE = 1. |
Expected prevalence* ratio | Relationship between the prevalence* of disease in the exposed (intervention) group and the prevalence* of disease in the non-exposed group, indicating how many times it is expected that the prevalence* will be higher (or lower) in the exposed compared to the non-exposed group. Usually, a value between 1.50 and 2.00 is used (exposure as risk factor) or between 0.50 and 0.75 (protective factor). For intervention studies, the clinical relevance of this value should be considered. The closer the prevalence* ratio is to one (the smaller the expected difference between the groups), the larger the required sample size. | It is the value that the investigators intend to find as HA, with the corresponding H0 equal to one (similar prevalence* of the outcome in both exposed and non-exposed groups). For the sample size estimates, the expected outcome prevalence* may be used for the non-exposed group, or the expected difference in the prevalence* between the exposed and the non-exposed groups. |
Type of statistical test | The test may be one-tailed or two-tailed, depending on the type of the HA. | Two-tailed tests require larger sample sizes than one-tailed tests. |
*It may be prevalence, incidence or risk, according to the type of study;
**Non-exposed or control group;
H0 - null hypothesis; HA - alternative hypothesis
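
The parameters in the chart can be combined in a short calculation. The sketch below is only illustrative: the chart does not state which sample size formula it assumes, so the code uses one common normal-approximation formula for comparing two proportions (the Fleiss formula without continuity correction). The function names `prevalence_non_exposed` and `sample_size_two_groups` and their arguments are hypothetical, chosen for this example; all prevalences are given as proportions between 0 and 1.

```python
from math import ceil, sqrt
from statistics import NormalDist


def prevalence_non_exposed(p_outcome, p_exposed, pr):
    """Chart formula: PONE = pO / (pNE + pE * PR), inputs as proportions."""
    p_non_exposed = 1.0 - p_exposed
    return p_outcome / (p_non_exposed + p_exposed * pr)


def sample_size_two_groups(p_ne, pr, alpha=0.05, power=0.80,
                           ratio=1.0, two_tailed=True):
    """Return (n_exposed, n_non_exposed) for comparing two proportions.

    Assumption: Fleiss normal-approximation formula, no continuity
    correction. `ratio` is the number of non-exposed per exposed.
    """
    p_e = p_ne * pr  # expected prevalence in the exposed group
    if not (0 < p_ne < 1 and 0 < p_e < 1):
        raise ValueError("prevalences must lie strictly between 0 and 1")
    if pr == 1 or ratio <= 0:
        raise ValueError("pr must differ from 1 and ratio must be positive")

    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_tailed else z(1 - alpha)
    z_beta = z(power)

    # Pooled prevalence under H0, weighted by group sizes
    p_bar = (p_e + ratio * p_ne) / (1 + ratio)
    term_h0 = z_alpha * sqrt((1 + 1 / ratio) * p_bar * (1 - p_bar))
    term_ha = z_beta * sqrt(p_e * (1 - p_e) + p_ne * (1 - p_ne) / ratio)

    n_exposed = ceil((term_h0 + term_ha) ** 2 / (p_e - p_ne) ** 2)
    return n_exposed, ceil(n_exposed * ratio)


if __name__ == "__main__":
    # Overall prevalence 10%, half the population exposed, expected PR = 2
    pone = prevalence_non_exposed(p_outcome=0.10, p_exposed=0.5, pr=2.0)
    print(round(pone, 4))
    print(sample_size_two_groups(p_ne=0.10, pr=2.0))
```

Varying one argument at a time reproduces the directions given in the chart: a smaller `alpha`, a higher `power`, a `pr` closer to one, a `ratio` further from one, or a two-tailed test each increase the total number of participants required.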