CHART 1.
Description of the parameters to be considered when calculating the sample size for a study that aims to estimate the frequency of health outcomes, behaviors or conditions
Parameter | Description | Remark |
---|---|---|
Population size | Total size of the population from which the sample will be drawn and about which researchers will draw conclusions (target population) | Information on population size may be obtained from secondary data from hospitals, health centers, census surveys (population, schools etc.). The smaller the target population (for example, fewer than 100 individuals), the larger the sample must be as a proportion of that population. |
Expected prevalence of outcome or event of interest | The study outcome must be a percentage, that is, a number that varies from 0% to 100%. | Information on the expected prevalence should be obtained from the literature or from a pilot study. When this information is not available in the literature and a pilot study cannot be carried out, the value that maximizes sample size is used (50%, for a fixed value of sample error). |
Sample error for estimate | The value we are willing to accept as error in the estimate obtained by the study. | The smaller the sample error, the larger the sample size and the greater the precision. In health studies, values between two and five percentage points are usually recommended. |
Confidence level | The probability that the estimate obtained by the study will fall within the established error margin around the true prevalence. | The higher the confidence level, the larger the sample size will be. This parameter is usually fixed at 95% (that is, a significance level of 5%). |
Design effect | It is necessary when the study participants are chosen by cluster selection procedures. This means that, instead of the participants being individually selected (simple, systematic or stratified sampling), they are first divided into groups (census tracts, neighborhoods, households, days of the week, etc.), groups are randomly selected, and individuals are then selected within these groups. Thus, greater similarity is expected among the respondents within a group than in the general population. This generates a loss of precision, which needs to be compensated by a sample size adjustment (increase). | The principle is that, because respondents within a cluster resemble one another, the variance estimated from the sample may be smaller than the true variance in the population. The value of the design effect may be obtained from the literature. When it is not available, a value between 1.5 and 2.0 may be assumed, and the investigators should evaluate the actual design effect after the study is completed and report it in their publications. The greater the homogeneity within each group (the more similar the respondents are within each cluster), the greater the design effect will be and the larger the sample size required to maintain precision. In studies that do not use cluster selection procedures (simple, systematic or stratified sampling), the design effect is taken to be 1.0 (no adjustment). |
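
The parameters in the chart can be combined as in the following minimal sketch. The chart itself does not state a formula, so the calculation below is an assumption: it uses the common normal-approximation formula for a proportion, a finite population correction, and a multiplicative design effect. The function name `sample_size_prevalence` and its argument names are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def sample_size_prevalence(population, prevalence, error,
                           confidence=0.95, design_effect=1.0):
    """Sample size for estimating a prevalence (illustrative sketch).

    population    -- target population size, or None to skip the
                     finite population correction
    prevalence    -- expected prevalence as a fraction (0.5 if unknown)
    error         -- acceptable sample error as a fraction (0.05 = 5 points)
    confidence    -- confidence level as a fraction (usually 0.95)
    design_effect -- 1.0 without cluster selection, otherwise > 1.0
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # Assumed formula: sample size for an effectively infinite population.
    n = z ** 2 * prevalence * (1 - prevalence) / error ** 2

    # Finite population correction (matters for small target populations).
    if population is not None:
        n = n / (1 + (n - 1) / population)

    # Inflate for cluster selection and round up to whole participants.
    return math.ceil(n * design_effect)


if __name__ == "__main__":
    # Unknown prevalence (50%), 5-point error, 95% confidence.
    print(sample_size_prevalence(None, 0.5, 0.05))
    print(sample_size_prevalence(10_000, 0.5, 0.05))
    print(sample_size_prevalence(10_000, 0.5, 0.05, design_effect=2.0))
    # A small target population needs a proportionally larger sample.
    print(sample_size_prevalence(100, 0.5, 0.05))
```

Under these assumptions, a target population of 100 individuals calls for a sample of 80 (80% of the population), whereas a population of 10,000 calls for 370 (under 4%), which illustrates the remark on population size in the chart. Setting `design_effect=2.0` doubles the required sample before rounding.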