ABSTRACT
Clinical research validity depends critically on sound sampling methodology and adequate sample size determination, yet many published studies demonstrate deficiencies in these fundamental aspects. This educational review addresses sampling techniques and sample size calculations in clinical research. The review covers probability sampling approaches, including simple random, systematic random, stratified, and cluster sampling methods, contrasting these with non‐probability techniques such as convenience, purposive, snowball, and quota sampling. For each method, we discuss implementation strategies, inherent biases, and appropriate clinical applications. Sample size determination principles are presented across multiple study designs, encompassing cross‐sectional prevalence studies, case–control investigations, cohort studies, randomized controlled trials, and correlational analyses. Key statistical concepts, including Type I and Type II errors, statistical power, effect size estimation, and variance considerations, are also explained. Additionally, some available software tools for sample size calculation are outlined to facilitate implementation. This review ultimately provides clinical researchers with essential knowledge to make informed methodological decisions that enhance study quality and contribute to the evidence base for healthcare decision‐making.
Keywords: biostatistics, clinical research design, sample size calculation, sampling methodology, statistical power
1. Introduction
Clinical research serves as the cornerstone of evidence‐based healthcare, driving advances in diagnosis, treatment, and prevention strategies that ultimately benefit patient populations worldwide. The translation of clinical questions into methodologically sound research studies requires careful consideration of numerous design elements, among which sampling methodology and sample size determination represent a critical foundation that profoundly influences study validity, reliability, and generalizability [1, 2, 3, 4].
In clinical research, the population is defined as a group of individuals who share a common characteristic or condition, typically a disease [5]. The process of participant selection, also known as sampling, involves the extraction of a finite subset of individuals, or a “sample”, from the target population, with the fundamental goal of generating findings that can be reliably generalized to the broader population of clinical interest [6]. This methodological process, however, must balance the theoretical ideal of perfect representativeness with some practical constraints, such as resource limitations, temporal factors, ethical considerations, and population accessibility.
Contemporary medical research demonstrates some concerning gaps between methodological requirements and actual practice, with numerous published studies failing to provide adequate justification for their sampling approaches or transparent documentation of sample size calculations [7, 8, 9]. This may compromise the scientific rigor of the research and impede the ability of clinicians, policymakers, and subsequent researchers to evaluate study quality accurately and interpret findings appropriately.
The implications of suboptimal sampling methodology, on the other hand, extend beyond individual studies to affect the broader research landscape. Poorly conceived sampling strategies introduce selection bias, compromise external validity, and reduce the clinical applicability of research findings to real‐world patient populations. Similarly, problematic sample size determination, seen in many published articles, can result in underpowered studies that fail to detect clinically meaningful effects, or conversely, overpowered studies that identify statistically significant but clinically irrelevant differences while unnecessarily exposing study participants to potential risks. In addition, the increasing complexity of modern clinical research, characterized by multi‐center collaborations, diverse patient populations, and sophisticated analytical approaches, demands a clearer understanding of the essential principles of sampling methodology and sample size calculation.
This educational review addresses the critical need for practical guidance on sampling methods and sample size determination in clinical research. We aim to bridge the gap between statistical theory and practical application, and to provide researchers with the knowledge and tools necessary to make methodological decisions that could ultimately enhance the quality and impact of their research contributions.
2. Theoretical Foundations of Sampling in Clinical Research
2.1. Conceptual Framework
The theoretical underpinnings of sampling methodology in clinical research rest upon the fundamental principle that a carefully selected subset of individuals can provide valid statistical inferences about characteristics of a larger target population [4]. This principle requires clear delineation between the target population (the complete group about which researchers wish to draw conclusions), the accessible population (the subset actually available for study), and the study sample (the specific individuals ultimately enrolled in the research).
Sampling methods can be broadly classified as either probability (random) or nonprobability sampling. In probability sampling, participants are chosen randomly, giving every individual in the population a known, nonzero chance of being selected (an equal chance in the case of simple random sampling) [10]. In contrast, nonprobability sampling involves the researcher intentionally selecting participants based on specific research objectives [11].
The sampling frame (available when the whole population is accessible) represents the operational foundation from which individual participants are selected, typically consisting of a comprehensive listing of potential study participants [5]. It is important to note that the quality and completeness of the sampling frame directly influence the representativeness of the resulting sample and the validity of statistical inferences drawn from the study findings.
2.2. Validity Considerations
Sampling methodology profoundly impacts both internal and external validity of clinical research [4]. Internal validity refers to the degree to which the study's findings accurately reflect the true relationships within the population being investigated [12], while external validity (generalizability) concerns the extent to which findings can be appropriately extrapolated to broader populations or different contexts [13].
Random sampling enhances external validity by reducing selection bias and ensuring that sample characteristics approximate those of the target population. However, it is important to note that randomization in participant selection differs fundamentally from randomization in treatment assignment (typically in randomized controlled trials), which primarily serves to enhance internal validity by balancing known and unknown confounding factors across the study groups.
2.3. Bias Minimization
Effective sampling strategies must address multiple potential sources of bias that can systematically distort study findings. Selection bias occurs when the sampling process excludes certain segments of the target population, resulting in samples that inadequately represent the intended population [14]. Participation bias, also known as nonresponse bias, is a common source of error in clinical trials and survey research. It occurs when the sample is disproportionately composed of individuals with certain characteristics that may influence participation, attrition, or outcomes, leading to a group that does not accurately represent the broader population and limiting generalizability [15]. A related form, self‐selection bias, arises when individuals selectively enroll themselves in a study. Response bias, on the other hand, occurs when participants provide inaccurate or misleading answers for various reasons [15].
3. Probability Sampling Methods
3.1. Simple Random Sampling
Simple random sampling represents the fundamental probability sampling approach, wherein each individual within the sampling frame possesses an equal and independent probability of selection for study participation [16]. Implementation of this approach typically involves random number generation or lottery‐based selection methods applied to a comprehensive list of potential participants (or the sampling frame) [5]. This approach minimizes selection bias and provides the strongest foundation for generalizing study findings to the broader target population. However, some practical limitations must be considered, including the requirement for complete sampling frames, potential difficulties in ensuring adequate representation of minority subgroups, and possible difficulties when studying rare conditions or geographically dispersed populations.
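As a minimal sketch of this procedure, the draw below uses Python's standard library against a hypothetical sampling frame of 1000 patient IDs (all identifiers and counts are invented for illustration):

```python
import random

# Hypothetical sampling frame: IDs of all eligible patients across centers
sampling_frame = [f"PT{i:04d}" for i in range(1, 1001)]  # 1000 eligible patients

random.seed(42)  # fixed seed so the draw is reproducible and auditable
# Each patient has an equal, independent probability of selection
sample = random.sample(sampling_frame, k=50)

print(len(sample), len(set(sample)))  # 50 distinct participants
```

In practice, the seed (or the random numbers themselves) should be documented so the selection process can be verified.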
For example, consider a multi‐center study examining the effectiveness of a new cardiac rehabilitation program. Simple random sampling would involve creating a comprehensive list of all eligible patients across participating centers and using random selection to determine study participants. However, while this approach maximizes representativeness, it may result in unequal distribution across centers or inadequate representation of certain patient subgroups.
3.2. Systematic Random Sampling (Interval Sampling)
In this method, researchers choose participants according to a predetermined rule that follows a fixed interval [5]. For example, if the rule is to include every 3rd patient, the sample would consist of patients numbered 3, 6, 9, 12, 15, and so forth. While this method usually relies on a sampling frame, it may not be necessary in some situations. For instance, if patients regularly visit a specific hospital or clinic, the researcher can randomly select a starting patient and then choose subsequent participants at fixed intervals [5].
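Assuming a hypothetical frame of 300 consecutive clinic attendees and a desired sample of 30, the random start plus fixed interval can be sketched as:

```python
import random

frame = list(range(1, 301))        # e.g., 300 consecutive clinic attendees
n_desired = 30
interval = len(frame) // n_desired  # sampling interval k = N / n = 10

random.seed(7)
start = random.randint(0, interval - 1)  # random start within the first interval
sample = frame[start::interval]          # every k-th attendee thereafter

print(len(sample))  # 30 participants, evenly spaced through the frame
```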
While systematic random sampling is efficient and straightforward to implement because of its regular intervals, it carries a risk of bias if there is an inherent pattern in the population that coincides with the sampling interval [4]. For example, if patient records are organized by admission day and certain conditions are more likely to present on specific days of the week, systematic sampling could inadvertently over‐ or under‐represent certain patient types.
3.3. Stratified Sampling
Stratified sampling involves dividing the target population into distinct strata based on specific characteristics, after which samples are drawn from each stratum using either simple or systematic sampling methods. The number of individuals selected from each stratum can be determined as a fixed count or proportionate to the size of the stratum [6]. This method is a modification of simple random sampling and, as such, also requires the availability of a complete sampling frame. It ensures adequate representation of key subgroups and often provides increased precision compared to simple random sampling, especially when sampling from minority or underrepresented populations [5].
Importantly, the selection of appropriate stratification variables requires careful consideration of certain factors that are both clinically relevant and can be readily identifiable within the sampling frame. Common stratification variables in clinical research include demographic characteristics (age, gender, ethnicity), disease severity indicators, or some institutional characteristics in the case of multi‐center studies.
Proportional and disproportional stratified sampling are two commonly encountered variants of this method. Proportional stratified sampling maintains the original population proportions within each stratum, while disproportional stratified sampling deliberately oversamples smaller strata to ensure adequate representation for subgroup analyses. For instance, in a study assessing the effectiveness of a new hypertension medication, patients could be stratified by age group (e.g., 40–49, 50–59, 60–69, and 70+). Proportional stratified sampling would select participants from each age group according to their representation in the overall patient population, preserving the natural distribution. On the other hand, disproportional stratified sampling might intentionally oversample smaller age groups, such as patients aged 70 and above, to ensure sufficient data for reliable subgroup analysis and meaningful comparisons across all age categories.
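The two allocation schemes can be illustrated numerically; the stratum sizes and quotas below are invented for illustration:

```python
# Hypothetical stratum sizes in the accessible patient population
strata = {"40-49": 400, "50-59": 300, "60-69": 200, "70+": 100}
total_n = 100  # overall sample size

N = sum(strata.values())  # 1000 patients overall
# Proportional allocation: each stratum contributes in line with its share
proportional = {g: round(total_n * size / N) for g, size in strata.items()}
print(proportional)  # {'40-49': 40, '50-59': 30, '60-69': 20, '70+': 10}

# Disproportional variant: fix a larger quota for the smallest stratum
# (here 25 for the 70+ group) and allocate the remainder proportionally
disproportional = {"70+": 25}
remaining = total_n - 25
n_rest = N - strata["70+"]  # 900 patients in the other strata
for g in ("40-49", "50-59", "60-69"):
    disproportional[g] = round(remaining * strata[g] / n_rest)
print(disproportional)
```

Within each stratum, the allocated number would then be drawn by simple or systematic random sampling, as described above.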
3.4. Cluster Sampling
Cluster sampling involves randomly selecting groups (clusters) of individuals rather than selecting individuals directly, making it particularly valuable when individual‐level sampling frames are unavailable or when geographical dispersion makes individual recruitment impractical [4, 5]. Clusters may be defined geographically (communities, regions), institutionally (hospitals, clinics), or administratively (departments, practice groups).
Multi‐stage cluster sampling extends this approach by implementing multiple levels of selection, such as first randomly selecting regions, then hospitals within selected regions, then departments within selected hospitals, and finally patients within selected departments. For example, a study evaluating post‐operative recovery outcomes might first select several hospitals across a country, then randomly choose surgical wards within those hospitals, and finally select patients from each ward.
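A sketch of such a three‐stage selection, using an invented frame of hospitals, wards, and patients:

```python
import random

random.seed(1)
# Hypothetical frame: 10 hospitals, each with 4 surgical wards of 40 patients
hospitals = {
    f"H{h}": {f"W{w}": [f"H{h}-W{w}-P{p}" for p in range(40)] for w in range(4)}
    for h in range(10)
}

chosen_hospitals = random.sample(list(hospitals), 3)      # stage 1: hospitals
sample = []
for h in chosen_hospitals:
    chosen_wards = random.sample(list(hospitals[h]), 2)   # stage 2: wards
    for w in chosen_wards:
        sample += random.sample(hospitals[h][w], 10)      # stage 3: patients

print(len(sample))  # 3 hospitals x 2 wards x 10 patients = 60
```

Note that clustered designs usually require sample size inflation by a design effect, since patients within the same ward tend to be more alike than patients sampled independently.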
4. Non‐Probability Sampling Methods
4.1. Convenience Sampling
Convenience sampling, the most frequently employed non‐probability method in clinical research, involves recruiting participants based on their accessibility and availability to the research team [5]. This approach is widely utilized in clinical settings where researchers recruit patients from specific hospitals, clinics, or treatment centers without random selection procedures.
While convenience sampling offers significant advantages in terms of feasibility, cost‐effectiveness, and rapid enrollment, it introduces substantial limitations regarding external validity and generalizability. In this scenario, the resulting samples may systematically differ from the broader population of interest, particularly with respect to demographic characteristics, disease severity, treatment‐seeking behaviors, or healthcare access patterns. Therefore, findings from studies using convenience sampling can be generalized solely to the (sub)population from which the sample is obtained, and not to the entire population [17].
In the case of hospital‐based or clinic‐based convenience sampling, researchers have to consider referral bias, as patients seeking care at tertiary care centers may differ systematically from those receiving care in community settings. Similarly, convenience samples may overrepresent motivated patients willing to participate in research while underrepresenting those with limited time, resources, or trust in medical research. Thus, researchers should explicitly address in the manuscript or report how the use of a convenience sample may have introduced bias into the estimates, potentially leading to an overestimation or underestimation of the outcome in the studied population [18].
4.2. Purposive (Judgmental) Sampling
Purposive sampling involves the deliberate selection of participants based on specific characteristics or criteria deemed important by the researcher [5]. This approach may be particularly valuable when the population is small and clearly meets the study's requirements, or when refining samples from other methods, as it relies on the researcher's expertise to select the most relevant participants [19]. While this method can provide valuable, targeted data relevant to specific research questions, it introduces potential bias based on the researcher's judgment and limits the ability to make statistical inferences to broader populations [4].
4.3. Snowball Sampling
Snowball sampling utilizes existing participants to recruit additional participants from their social or professional networks, creating a chain‐referral process that can be particularly valuable when studying hard‐to‐reach populations or stigmatized conditions [4, 20]. This approach leverages trust relationships and insider knowledge to access populations that might be difficult to reach through traditional sampling methods.
The primary limitation of snowball sampling involves the potential for bias introduced by network effects, as participants tend to refer individuals similar to themselves in terms of demographics, attitudes, or experiences. This can result in samples that lack diversity and may not represent the full spectrum of the target population. Additionally, the non‐random nature of snowball sampling precludes statistical inference to broader populations.
4.4. Quota Sampling
Quota sampling is a non‐probability sampling method where researchers divide the population into exclusive subgroups based on specific characteristics and then select participants from each stratum until a predetermined quota is filled [6]. This approach is widely employed in market research, public opinion polls, and epidemiological studies where probability sampling is impractical due to constraints of time, cost, or the lack of a complete sampling frame.
Quota sampling is practical and efficient for capturing population diversity across key characteristics, but its non‐random selection process introduces a high risk of bias, limiting the representativeness of the sample [4].
5. Sample Size Determination: Principles and Applications
5.1. Background
Sample size determination represents one of the fundamental methodological considerations in biomedical and clinical research, serving as a critical determinant of study validity, ethical conduct, and resource allocation [21]. The critical importance of appropriate sample size calculation has gained heightened recognition within the framework of evidence‐based medicine, where the strength of scientific evidence is directly contingent upon the statistical soundness of supporting research [21, 22]. As Ioannidis [23] demonstrated in his seminal work on research reliability, inadequate statistical power, which often arises from insufficient sample sizes, contributes significantly to the production of false research findings, undermining the credibility of scientific literature.
One of the key challenges researchers face when determining sample sizes is achieving an optimal balance between statistical adequacy and practical constraints while maintaining ethical research standards. Studies with insufficient sample sizes suffer from reduced statistical power, increasing the likelihood of Type II errors and potentially failing to detect clinically meaningful treatment effects [24, 25]. This “power failure”, as described by Button and colleagues [26], not only undermines the reliability of individual studies but also reduces the reproducibility of findings. Conversely, unnecessarily large studies waste resources and may expose participants to potential harm [27].
The methodological complexity of sample size calculation stems from its dependence on multiple interconnected parameters, including study design characteristics, outcome variable types, effect size specifications, statistical power requirements, and significance levels [8, 28]. As Wilson and Morgan [29] outlined in their comprehensive analysis of power calculations, different research paradigms require distinct approaches to sample size estimation. Cross‐sectional prevalence studies, randomized controlled trials, cohort investigations, case–control studies, and even experimental animal research each necessitate specific mathematical formulations and statistical considerations tailored to their unique methodological frameworks [21, 30].
Moreover, contemporary challenges in sample size calculation practice reflect persistent gaps between statistical theory and practical application. Despite extensive methodological literature and increasingly sophisticated computational tools, published research frequently demonstrates inadequate reporting of sample size rationales or inappropriate application of calculation methods [9, 31, 32]. This disconnect has prompted calls for improved statistical education and standardized reporting requirements, as evidenced by enhanced guidelines from major journals and funding agencies requiring explicit sample size justifications. Prominent reporting guidelines such as CONSORT [33] and STROBE [34] also mandate this, emphasizing that a justified sample size enhances a study's credibility and the reliability of findings.
5.2. Fundamental Statistical Concepts
Sample size determination represents a critical component of research design that requires integration of statistical principles with clinical knowledge and practical considerations. The primary objective typically involves determining the minimum number of participants necessary to detect clinically meaningful effects with adequate statistical power while avoiding unnecessarily large samples that may be ethically problematic or resource‐intensive.
Statistical power, defined as the probability of rejecting a false null hypothesis, is calculated as 1 minus the Type II error rate (β) [8]. Conventional standards typically require 80% or 90% power, meaning the study has an 80% or 90% probability of detecting the hypothesized effect if it truly exists in the population. Higher power levels provide greater assurance of detecting true effects but require larger sample sizes.
Type I error (α) refers to the probability of incorrectly rejecting a true null hypothesis, that is, accepting the alternative hypothesis when it is not true in the population [8, 35]. The alpha level specifies the acceptable likelihood of committing a Type I error and is conventionally set at 0.05 (5%) in most clinical research studies. Type II error (β) represents the probability of failing to reject a false null hypothesis, typically set at 0.20 (20%) for 80% power or 0.10 (10%) for 90% power [36].
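These quantities can be illustrated numerically. The sketch below uses the textbook normal‐approximation power function for a two‐sample comparison of means with a standardized effect size; it is a generic approximation, not tied to any particular study in this review:

```python
import math
from statistics import NormalDist

z = NormalDist()
alpha, power = 0.05, 0.80
z_alpha = z.inv_cdf(1 - alpha / 2)  # approx. 1.96 for a two-sided test
z_beta = z.inv_cdf(power)           # approx. 0.84, i.e., beta = 0.20


def achieved_power(d, n, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test for a
    standardized effect size d with n participants per group."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(d * math.sqrt(n / 2) - z_crit)


# A medium standardized effect (d = 0.5) with 64 per group yields ~80% power
print(achieved_power(0.5, 64))
```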
5.3. Effect Size Determination
Effect size quantifies the magnitude of the difference or association that is considered clinically or practically meaningful, representing perhaps the most challenging aspect of sample size calculation [9, 37]. Researchers must specify the minimum clinically important difference they wish to detect, requiring integration of clinical expertise, patient perspectives, and existing literature.
Effect sizes can be expressed in various formats depending on the study design and outcome measures. For continuous outcomes, effect sizes are typically expressed as mean differences or standardized mean differences. For dichotomous outcomes, effect sizes may be expressed as absolute risk differences, relative risks, or odds ratios. For time‐to‐event outcomes, hazard ratios represent the primary effect size measure. When specific effect sizes are unknown, researchers may rely on standardized effect size conventions (small, medium, large), estimates derived from systematic reviews or meta‐analyses, or data from similar published studies. However, the clinical meaningfulness of standardized effect sizes varies considerably across different medical conditions and patient populations, necessitating careful consideration of clinical context.
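For continuous outcomes, the standardized mean difference (Cohen's d) can be computed directly from group summary statistics; the blood pressure figures below are hypothetical:

```python
import math

# Hypothetical summary statistics from two treatment groups (systolic BP, mmHg)
mean_a, sd_a, n_a = 142.0, 15.0, 50   # group A
mean_b, sd_b, n_b = 134.0, 14.0, 50   # group B

# Pooled standard deviation across the two groups
pooled_sd = math.sqrt(
    ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
)
# Standardized mean difference (Cohen's d)
d = (mean_a - mean_b) / pooled_sd
print(round(d, 2))  # approx. 0.55, a "medium" effect by convention
```

As the text notes, whether a d of this size is clinically meaningful depends on the condition and population, not on the conventional small/medium/large labels alone.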
5.4. Variance Estimation
In some clinical research scenarios, for continuous outcomes, sample size calculations require accurate estimates of the population variance or standard deviation [38]. These estimates can be obtained from previous studies examining similar populations and outcomes, pilot data from the research team, or theoretical considerations based on the measurement properties of outcome instruments. The accuracy of variance estimates directly influences the adequacy of calculated sample sizes. Underestimating variance results in underpowered studies that may fail to detect true effects, while overestimating variance leads to unnecessarily large sample sizes that waste resources and may expose additional participants to research risks without scientific benefit.
6. Sample Size Calculation Across Study Designs
6.1. Cross‐Sectional Studies
Cross‐sectional studies, also referred to as prevalence studies or surveys, aim to estimate the frequency of specific characteristics, conditions, or outcomes within a defined population at a particular point in time. Prevalence, within a defined population, refers to the proportion of people who have a certain disease or characteristic at a given point in time or across a specified period. In epidemiology, it differs from incidence, as prevalence includes both existing and past cases present during the assessment, whereas incidence only measures new cases that arise [39]. Point prevalence indicates the proportion of individuals with a condition at a single moment in time, essentially capturing the percentage affected on a specific date. In contrast, period prevalence describes the proportion of individuals who experience the condition at any time during a designated timeframe [39].
Sample size calculation for cross‐sectional studies depends on several key parameters: the expected prevalence rate, desired precision (margin of error), confidence level, and population size [40]. Population size may be classified as either finite (when the exact number is known) or infinite (when it is assumed to be unknown). The formula used to determine the sample size varies depending on which assumption is applied. In most cases, researchers treat the population as infinite, except when the actual target population is relatively small [30].
When the objective is to estimate prevalence, the formula for determining sample size is relatively straightforward and can be found in many standard references. The formula [41] is as follows: n = Z²P(1 − P)/d², where n represents the sample size, Z is the statistical value corresponding to the chosen confidence level, P is the anticipated prevalence (which may be derived from previous studies or a pilot study), and d denotes the desired precision (margin of error). A 95% confidence level is most commonly used, with results typically reported using a 95% confidence interval (CI). Researchers seeking greater certainty may opt for a 99% confidence interval instead [42]. However, manual application of the formula is discouraged, as hand calculation may introduce human error; instead, available software can be utilized, allowing attention to be focused on carefully selecting the appropriate parameters for the calculation [43].
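A small sketch of this calculation (the Z value is derived from the normal distribution rather than hard‐coded, and the example inputs are illustrative):

```python
import math
from statistics import NormalDist


def prevalence_sample_size(p, d, confidence=0.95):
    """n = Z^2 * P * (1 - P) / d^2 for estimating a prevalence P
    with absolute precision d at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # 1.96 for 95%
    return math.ceil(z**2 * p * (1 - p) / d**2)


# Anticipated prevalence 20%, precision +/-5%, 95% confidence
print(prevalence_sample_size(0.20, 0.05))  # 246
# With no prior estimate, P = 0.5 gives the most conservative (largest) n
print(prevalence_sample_size(0.50, 0.05))  # 385
```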
6.2. Case–Control Studies
Case–control studies compare exposure histories between individuals with a specific outcome (cases) and those without the outcome (controls), making them particularly valuable for investigating rare diseases or conditions with long latency periods [44]. These studies are retrospective by design and focus on identifying factors that may contribute to disease development.
The odds ratio (OR) is the commonly used measure of association in case–control studies [45]. It represents the ratio of the odds of an event occurring in the exposed group to the odds of it occurring in the unexposed group. An OR greater than 1 suggests increased odds of the outcome with exposure, whereas an OR less than 1 indicates reduced odds [46]. The larger the OR, the stronger the association between the exposure and the outcome.
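Using an invented 2×2 table, the OR is simply the ratio of the two odds:

```python
# Hypothetical case-control data:        exposed  unexposed
cases = [40, 60]      # individuals with the outcome
controls = [25, 75]   # individuals without the outcome

odds_cases = cases[0] / cases[1]           # 40/60
odds_controls = controls[0] / controls[1]  # 25/75
odds_ratio = odds_cases / odds_controls
print(odds_ratio)  # 2.0: exposure doubles the odds of the outcome
```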
Matching is a key technique in case–control studies, used to ensure that the groups being compared have similar distributions of certain characteristics. This practice, by comparing “like with like”, aims to improve the statistical efficiency and cost‐effectiveness of a study [47]. The design of a case–control study can be either unmatched (independent) or matched (dependent). An unmatched design involves selecting a shared control group at random for all cases, based on predefined attributes. In contrast, a matched case–control study selects controls on a case‐by‐case basis, pairing them with individual cases to ensure similarity in key confounding factors such as age and sex.
Matching can be a simple 1:1 ratio, but in studies using large electronic health databases, it is often possible to match one case with five or even 10 controls to increase the study's efficiency. The data from these studies are typically analyzed using logistic regression, which adjusts for confounding variables to determine if a past event or exposure is significantly associated with the outcome, or “caseness” [48]. For studies with well‐matched cases and controls, a specific approach known as conditional logistic regression analysis may be employed [49].
The requirements for sample size calculation in an independent case–control study are based on either odds ratios (OR) or two proportions, which represent the exposure rates in the case and control groups. To perform the calculation, several key parameters must be specified: the probability of exposure in controls, and either the expected OR or the probability of exposure in cases. In addition, the calculation requires the desired statistical power, typically set at 0.8, 0.85, or 0.9, and the alpha level, which is usually 0.05. Finally, the number of controls per case must be determined, with a 1:1 ratio often used for equal groups [50].
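As a sketch of how these parameters combine, the widely used normal‐approximation formula for comparing two proportions can be wrapped in a helper; the OR‐to‐proportion conversion and all input values below are illustrative, and dedicated software should be preferred for a real study:

```python
import math
from statistics import NormalDist


def case_control_n(p0, odds_ratio, power=0.80, alpha=0.05, ratio=1):
    """Cases required for an unmatched case-control study (normal
    approximation, no continuity correction). p0 = exposure probability
    in controls; ratio = controls per case."""
    # Convert the expected OR into the exposure probability among cases
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    pbar = (p1 + ratio * p0) / (1 + ratio)  # pooled exposure proportion
    n = ((z_a * math.sqrt((1 + 1 / ratio) * pbar * (1 - pbar))
          + z_b * math.sqrt(p1 * (1 - p1) + p0 * (1 - p0) / ratio)) ** 2
         / (p1 - p0) ** 2)
    return math.ceil(n)


# 30% exposure among controls, expected OR = 2, 80% power, 1:1 design
print(case_control_n(p0=0.30, odds_ratio=2.0))  # 141 cases (and 141 controls)
```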
6.3. Cohort Studies
Cohort studies are a type of longitudinal observational study that follows a defined group of people with a common characteristic over a period of time, often many years [51, 52]. These studies are particularly valuable in epidemiology as they help identify the factors that raise or lower the risk of developing a disease [52]. A key strength is their ability to establish the temporal relationship between an exposure and an outcome, as participants do not have the outcome at the study's beginning [53, 54].
Cohort studies can be conducted prospectively, following participants from the present to the future, or retrospectively, using existing historical data. While prospective studies allow investigators to determine the incidence of new cases over time, they are often complex, time‐consuming, and may be expensive to conduct [51, 54]. Retrospective studies, while more pragmatic and less expensive, carry a higher risk of bias due to missing or incomplete data, are vulnerable to recall bias, and offer less control over study variables [51, 54].
Relative risk (RR) is the ratio of the probability of an event occurring in the exposed group to the probability of the event occurring in the unexposed group. It is often confused with the odds ratio or absolute risk, but, unlike these measures, RR directly compares event probabilities between groups. Calculating relative risk requires knowledge of each individual's exposure status and is thus frequently reported in prospective cohort studies, where both exposure and disease incidence can be accurately measured [55].
The sample size for an independent cohort study is calculated by comparing two proportions, which represent the event rates in the exposed and non‐exposed groups. To perform this calculation, researchers must specify the probability of an event in the non‐exposed group, and either the expected risk ratio (RR) or the probability of an event in the exposed group. The desired statistical power, typically 0.8, 0.85, or 0.9, and the alpha level, usually 0.05, are also required. Finally, the ratio of the unexposed to the exposed group must be determined, which is 1 for equal groups [56].
Prospective cohort studies, however, must carefully consider anticipated attrition rates over the study period, particularly for studies with extended follow‐up duration. Loss to follow‐up can substantially reduce statistical power and may introduce bias if losses are differential between exposure groups. Thus, sample size calculations should incorporate realistic estimates of attrition based on similar studies in comparable populations if available.
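The cohort calculation, together with an inflation for anticipated attrition, can be sketched with the same standard two‐proportion approximation; the event rates, RR, and 15% attrition figure below are invented for illustration:

```python
import math
from statistics import NormalDist


def cohort_n(p0, rr, power=0.80, alpha=0.05, ratio=1):
    """Exposed subjects required for an independent cohort study (normal
    approximation). p0 = event probability in the unexposed group;
    p1 = RR * p0 in the exposed group; ratio = unexposed per exposed."""
    p1 = rr * p0
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    pbar = (p1 + ratio * p0) / (1 + ratio)  # pooled event proportion
    n = ((z_a * math.sqrt((1 + 1 / ratio) * pbar * (1 - pbar))
          + z_b * math.sqrt(p1 * (1 - p1) + p0 * (1 - p0) / ratio)) ** 2
         / (p1 - p0) ** 2)
    return math.ceil(n)


# 10% event rate in the unexposed group, expected RR = 2, equal groups
n_exposed = cohort_n(p0=0.10, rr=2.0)           # 199 per group
# Inflate recruitment for an anticipated 15% loss to follow-up
n_recruit = math.ceil(n_exposed / (1 - 0.15))   # 235 per group
print(n_exposed, n_recruit)
```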
6.4. Randomized Controlled Trials
Randomized controlled trials (RCTs) are widely regarded as the gold standard for evaluating therapeutic interventions, as their design minimizes bias and allows for the most robust assessment of causal relationships in clinical research [57, 58]. Importantly, designing an RCT requires careful consideration of numerous factors, particularly when determining sample size. Various RCT designs and features can be employed to address specific research hypotheses, and these designs have become increasingly diverse as methods evolve to tackle more complex scientific questions [58].
Superiority trials aim to demonstrate that one treatment is superior to another, requiring sample sizes calculated to detect the minimum clinically important difference with adequate power. The determination of clinically important differences requires integration of clinical judgment, patient preferences, and consideration of treatment risks and costs.
In superiority clinical trials that compare two interventions with a continuous endpoint expressed as a mean difference, sample size calculation requires several key specifications. These include the expected effect size (ES), the chosen trial design (crossover or parallel), the allocation ratio between experimental and control groups, as well as the desired statistical power and alpha level [59]. For continuous outcomes, the ES is determined by the expected mean difference between the two groups and the standard deviation of that difference. However, in the case of binary outcomes (events such as remission or death), the ES corresponds to the expected event rates in each group [59].
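For a parallel-group design with a continuous endpoint, these specifications reduce to a short formula. The sketch below uses the standard normal-approximation calculation; the numbers in the example (a 5-point difference with SD 10) are hypothetical.

```python
from statistics import NormalDist
from math import ceil

def superiority_n_per_group(mean_diff, sd, alpha=0.05, power=0.80,
                            ratio=1.0):
    """Experimental-group n for a parallel superiority trial,
    continuous outcome.

    mean_diff: minimum clinically important difference between groups
    sd: standard deviation of the outcome
    ratio: allocation ratio control:experimental (1.0 for equal groups);
           the control group then needs ceil(ratio * n) participants
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    n = (1 + 1 / ratio) * (sd * (z_alpha + z_beta) / mean_diff) ** 2
    return ceil(n)

# Detect a 5-point mean difference, SD 10, 80% power, 1:1 allocation
n = superiority_n_per_group(5, 10)  # -> 63 per group
```

Equivalently, the effect size here is Cohen's d = 5/10 = 0.5; halving the detectable difference roughly quadruples the required sample size, which is why the choice of the minimum clinically important difference dominates the calculation.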
Non‐inferiority trials aim to demonstrate that a new treatment is not substantially worse than an established treatment, while equivalence trials seek to demonstrate that two treatments produce similar effects within predefined equivalence margins. The choice of non‐inferiority or equivalence margins represents a critical design decision that balances clinical judgment with regulatory considerations.
The sample size calculation for non‐inferiority or equivalence trials requires specifying the expected effect size (ES), the trial design (crossover or parallel), the allocation ratio between experimental and control groups, the desired statistical power, alpha level, as well as the non‐inferiority or equivalence margin [60]. For continuous outcomes, the ES is defined as the expected mean difference between groups and the standard deviation of that difference, whereas for binary outcomes (events such as remission or death), it corresponds to the expected event rates in each group [60].
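The role of the margin can be illustrated for a continuous outcome in a 1:1 parallel design. The sketch below assumes a one-sided test at alpha = 0.025 (conventional for non-inferiority) and that the treatments are truly equal unless stated otherwise; the margin and SD values are hypothetical.

```python
from statistics import NormalDist
from math import ceil

def noninferiority_n_per_group(margin, sd, true_diff=0.0,
                               alpha=0.025, power=0.80):
    """Per-group n for a 1:1 parallel non-inferiority trial,
    continuous outcome.

    margin: non-inferiority margin (largest acceptable deficit)
    true_diff: assumed true mean difference (0 if truly equal)
    alpha: one-sided significance level (0.025 is conventional)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided test
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * (sd * (z_alpha + z_beta) / (margin - true_diff)) ** 2
    return ceil(n)

# Margin of 5 points, SD 10, treatments assumed truly equal, 90% power
n = noninferiority_n_per_group(5, 10, power=0.90)  # -> 85 per group
```

Shrinking the margin inflates the sample size quadratically, which is why margin selection is the pivotal design decision in these trials.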
6.5. Sample Size Calculation in Correlational Studies
Sample size calculation for a clinical study aimed at determining the correlation coefficient between two variables can be applied to various study designs, including cross‐sectional, cohort, case–control, or clinical trials. The key requirements for this calculation are the expected correlation coefficient, the statistical power, the alpha level, and the correlation coefficient specified for the null hypothesis, which is typically 0 or 0.2 [61].
The correlation coefficient (for the alternative hypothesis) can be obtained from a previous similar study, which is preferred as it provides an empirically derived value relevant to the research context. If no prior data are available, a value of 0.3 may be assumed for the correlation coefficient, especially when the aim is to determine whether the two variables exhibit a sizable or significant correlation [62, 63].
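The usual calculation relies on Fisher's z transformation of the correlation coefficient. A minimal sketch, using the default assumption of r = 0.3 against a null of 0:

```python
from statistics import NormalDist
from math import atanh, ceil

def correlation_sample_size(r_alt, r_null=0.0, alpha=0.05, power=0.80):
    """n to detect a correlation r_alt against H0: rho = r_null.

    Uses Fisher's z transformation, atanh(r) = 0.5 * ln((1+r)/(1-r)).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    c = atanh(r_alt) - atanh(r_null)  # difference on Fisher's z scale
    return ceil(((z_alpha + z_beta) / c) ** 2 + 3)

# Expecting r = 0.3 versus a null of 0, 80% power, alpha = 0.05
n = correlation_sample_size(0.3)  # -> 85 participants
```

As with the other formulas, weaker expected correlations demand markedly larger samples: detecting r = 0.1 requires several times the n needed for r = 0.3.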
7. Valuable Tools for Sample Size Calculation
Calculating an appropriate sample size is a critical step in research design, but the underlying formulas can be complex. To simplify this process, a variety of software and web‐based tools have been developed, each with distinct features and applications.
G*Power [64] represents one of the most widely adopted statistical software packages, providing comprehensive power analysis capabilities for social, behavioral, and biomedical sciences. The software's intuitive graphical user interface streamlines the process of calculating sample size and power for numerous statistical methods [65].
StatsDirect software [66] represents another valuable resource for clinical researchers, offering integrated statistical analysis capabilities alongside sample size calculation functions. It allows researchers to specify essential parameters such as significance level, desired power, and effect size, and supports a variety of study designs, including case–control and cohort studies, survival analyses, and surveys, making sample size determination straightforward and efficient.
Lastly, OpenEpi [67] is an open‐source online calculator designed for epidemiological statistics in public health research. It provides sample size and power calculations for a variety of study designs, including cross‐sectional studies, unmatched case–control studies, cohort studies, and randomized controlled trials. Its user‐friendly interface and accessibility make it an excellent complement to other statistical software and a useful tool for both teaching and applied practice.
Author Contributions
Azzam Zrineh: conceptualization, methodology, writing – original draft, writing – review and editing. Maysa Al‐Usta: writing – original draft, writing – review and editing. Abdallah Alwawi: writing – original draft, writing – review and editing, supervision.
Funding
The authors have nothing to report.
Ethics Statement
The authors have nothing to report.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
AI‐assisted tools were used during the preparation of this manuscript solely for language editing and readability enhancement. Grammarly was utilized for grammar checking and sentence structure refinement. ChatGPT was consulted in a limited manner during the revision stage to support rephrasing for clarity and readability only. The authors confirm that no AI tools were used for methodological decisions, scientific interpretation, or generation of original content. All scientific content and conclusions are the sole responsibility of the authors.
Zrineh A., Al‐Usta M., and Alwawi A., “Sampling Methods and Sample Size Determination in Clinical Research: An Educational Review,” Journal of General and Family Medicine 27, no. 1 (2026): e70096, 10.1002/jgf2.70096.
Data Availability Statement
This is a narrative educational review that does not involve original research data.
References
- 1. Suresh K., Thomas S., and Suresh G., “Design, Data Analysis and Sampling Techniques for Clinical Research,” Annals of Indian Academy of Neurology 14 (2011): 287–290.
- 2. Faber J. and Fonseca L. M., “How Sample Size Influences Research Outcomes,” Dental Press Journal of Orthodontics 19 (2014): 27–29.
- 3. Wang X. and Ji X., “Sample Size Estimation in Clinical Research,” Chest 158 (2020): S12–S20.
- 4. Ahmed S. K., “How to Choose a Sampling Technique and Determine Sample Size for Research: A Simplified Guide for Researchers,” Oral Oncology Reports 12 (2024): 100662.
- 5. Elfil M. and Negida A., “Sampling Methods in Clinical Research; an Educational Review,” Emergency (Tehran) 5 (2017): e52.
- 6. Martínez‐Mesa J., González‐Chica D. A., Duquia R. P., Bonamigo R. R., and Bastos J. L., “Sampling: How to Select Participants in My Research Study?,” Anais Brasileiros de Dermatologia 91 (2016): 326–330.
- 7. Rudolph J. E., Zhong Y., Duggal P., Mehta S. H., and Lau B., “Defining Representativeness of Study Samples in Medical and Population Health Research,” BMJ Medicine 2 (2023): e000399.
- 8. Serdar C. C., Cihan M., Yücel D., and Serdar M. A., “Sample Size, Power and Effect Size Revisited: Simplified and Practical Approaches in Pre‐Clinical, Clinical and Laboratory Studies,” Biochemia Medica (Zagreb) 31 (2021): 010502.
- 9. Althubaiti A., “Sample Size Determination: A Practical Guide for Health Researchers,” Journal of General and Family Medicine 24 (2023): 72–78.
- 10. Stratton S. J., “Population Research: Convenience Sampling Strategies,” Prehospital and Disaster Medicine 36 (2021): 373–374.
- 11. Vaidyanathan A. K., “Randomization in Clinical Research,” Journal of the Indian Prosthodontic Society 24 (2024): 1–2.
- 12. Patino C. M. and Ferreira J. C., “Internal and External Validity: Can You Apply Research Study Results to Your Patients?,” Jornal Brasileiro de Pneumologia 44 (2018): 183.
- 13. Andrade C., “Internal, External, and Ecological Validity in Research Design, Conduct, and Evaluation,” Indian Journal of Psychological Medicine 40 (2018): 498–499.
- 14. Arias F. D., Navarro M., Elfanagely Y., and Elfanagely O., “Biases in Research Studies,” in Translational Surgery (Elsevier, 2023), 191–194.
- 15. Elston D. M., “Participation Bias, Self‐Selection Bias, and Response Bias,” Journal of the American Academy of Dermatology 85 (2021): 1–2.
- 16. Noor S., Tajik O., and Golzar J., “Simple Random Sampling,” International Journal of Education and Language Studies 1 (2022): 78–82, 10.22034/ijels.2022.162982.
- 17. Andrade C., “The Inconvenient Truth About Convenience and Purposive Samples,” Indian Journal of Psychological Medicine 43 (2021): 86–88.
- 18. Setia M. S., “Methodology Series Module 5: Sampling Strategies,” Indian Journal of Dermatology 61 (2016): 505–509.
- 19. Bhardwaj P., “Types of Sampling in Research,” Journal of the Practice of Cardiovascular Sciences 5 (2019): 157–163.
- 20. Ting H., Memon M. A., Thurasamy R., and Cheah J.‐H., “Snowball Sampling: A Review and Guidelines for Survey Research,” Asian Journal of Business Research 15 (2025): 1–15.
- 21. Sadiq I. Z., Usman A., Muhammad A., and Ahmad K. H., “Sample Size Calculation in Biomedical, Clinical and Biological Sciences Research,” Journal of Umm Al‐Qura University for Applied Sciences 11 (2025): 133–141.
- 22. Charan J. and Biswas T., “How to Calculate Sample Size for Different Study Designs in Medical Research?,” Indian Journal of Psychological Medicine 35 (2013): 121–126.
- 23. Ioannidis J. P. A., “Why Most Published Research Findings Are False,” PLoS Medicine 2 (2005): e124.
- 24. Shreffler J. and Huecker M. R., “Type I and Type II Errors and Statistical Power,” in StatPearls (StatPearls Publishing, 2025), http://www.ncbi.nlm.nih.gov/books/NBK557530/.
- 25. Columb M. and Atkinson M., “Statistical Analysis: Sample Size and Power Estimations,” BJA Education 16 (2016): 159–161.
- 26. Button K. S., Ioannidis J. P. A., Mokrysz C., et al., “Power Failure: Why Small Sample Size Undermines the Reliability of Neuroscience,” Nature Reviews. Neuroscience 14 (2013): 365–376.
- 27. Guo Y., Logan H. L., Glueck D. H., and Muller K. E., “Selecting a Sample Size for Studies With Repeated Measures,” BMC Medical Research Methodology 13 (2013): 100.
- 28. Das S., Mitra K., and Mandal M., “Sample Size Calculation: Basic Principles,” Indian Journal of Anaesthesia 60 (2016): 652–656.
- 29. Wilson Van Voorhis C. R. and Morgan B. L., “Understanding Power and Rules of Thumb for Determining Sample Sizes,” Quantitative Methods for Psychology 3 (2007): 43–50.
- 30. Ranganathan P., Deo V., and Pramesh C. S., “Sample Size Calculation in Clinical Research,” Perspectives in Clinical Research 15 (2024): 155–159.
- 31. Dhiman P., Ma J., Qi C., et al., “Sample Size Requirements Are Not Being Considered in Studies Developing Prediction Models for Binary Outcomes: A Systematic Review,” BMC Medical Research Methodology 23 (2023): 188.
- 32. AbdulRaheem Y., “Statistics in Medical Research: Common Mistakes,” Journal of Taibah University Medical Sciences 18 (2023): 1197–1199.
- 33. Hopewell S., Chan A.‐W., Collins G. S., et al., “CONSORT 2025 Statement: Updated Guideline for Reporting Randomised Trials,” PLoS Medicine 22 (2025): e1004587.
- 34. Cuschieri S., “The STROBE Guidelines,” Saudi Journal of Anesthesia 13 (2019): S31–S34.
- 35. Kim H.‐Y., “Statistical Notes for Clinical Researchers: Type I and Type II Errors in Statistical Decision,” Restorative Dentistry and Endodontics 40 (2015): 249–252.
- 36. Sullivan G. M. and Feinn R. S., “Do You Have Power? Considering Type II Error in Medical Education,” Journal of Graduate Medical Education 13 (2021): 753–756.
- 37. Kallogjeri D. and Piccirillo J. F., “A Simple Guide to Effect Size Measures,” JAMA Otolaryngology—Head & Neck Surgery 149 (2023): 447–451.
- 38. Whitehead A. L., Julious S. A., Cooper C. L., and Campbell M. J., “Estimating the Sample Size for a Pilot Randomised Trial to Minimise the Overall Trial Sample Size for the External Pilot and Main Trial for a Continuous Outcome Variable,” Statistical Methods in Medical Research 25 (2016): 1057–1073.
- 39. Dicker R. C., Coronado F., Koo D., and Parrish R. G., “Principles of Epidemiology in Public Health Practice; an Introduction to Applied Epidemiology and Biostatistics,” 3rd edition, 2006, accessed 20 September 2025, https://stacks.cdc.gov.
- 40. Khaled Fahim N. and Negida A., “Sample Size Calculation Guide ‐ Part 1: How to Calculate the Sample Size Based on the Prevalence Rate,” Advanced Journal of Emergency Medicine 2 (2018): e50.
- 41. Daniel W. W. and Cross C. L., Biostatistics a Foundation for Analysis in the Health Sciences, 10th ed. (John Wiley & Sons, 2013), https://www.scirp.org/reference/referencespapers?referenceid=3123259.
- 42. Pourhoseingholi M. A., Vahedi M., and Rahimzadeh M., “Sample Size Calculation in Medical Studies,” Gastroenterology and Hepatology From Bed to Bench 6 (2013): 14–17.
- 43. Naing L., Nordin R. B., Abdul Rahman H., and Naing Y. T., “Sample Size Calculation for Prevalence Studies Using Scalex and ScalaR Calculators,” BMC Medical Research Methodology 22 (2022): 209.
- 44. Dey T., Mukherjee A., and Chakraborty S., “A Practical Overview of Case‐Control Studies in Clinical Practice,” Chest 158 (2020): S57–S64.
- 45. Tenny S. and Hoffman M. R., “Odds Ratio,” in StatPearls (StatPearls Publishing, 2025), http://www.ncbi.nlm.nih.gov/books/NBK431098/.
- 46. Andrade C., “Understanding Relative Risk, Odds Ratio, and Related Terms: As Simple as It Can Get,” Journal of Clinical Psychiatry 76 (2015): e857–e861.
- 47. Iwagami M. and Shinozaki T., “Introduction to Matching in Case‐Control and Cohort Studies,” Annals of Clinical Epidemiology 4 (2022): 33–40.
- 48. Andrade C., “Research Design: Case‐Control Studies,” Indian Journal of Psychological Medicine 44 (2022): 307–309.
- 49. Kuo C.‐L., Duan Y., and Grady J., “Unconditional or Conditional Logistic Regression Model for Age‐Matched Case‐Control Data?,” Frontiers in Public Health 6 (2018): 57.
- 50. Fahim N. K., Negida A., and Fahim A. K., “Sample Size Calculation Guide ‐ Part 3: How to Calculate the Sample Size for an Independent Case‐Control Study,” Advanced Journal of Emergency Medicine 3 (2019): e20.
- 51. Capili B. and Anastasi J. K., “Cohort Studies,” American Journal of Nursing 121 (2021): 45–48.
- 52. Barrett D. and Noble H., “What Are Cohort Studies?,” Evidence‐Based Nursing 22 (2019): 95–96.
- 53. Setia M. S., “Methodology Series Module 1: Cohort Studies,” Indian Journal of Dermatology 61 (2016): 21–25.
- 54. Song J. W. and Chung K. C., “Observational Studies: Cohort and Case‐Control Studies,” Plastic and Reconstructive Surgery 126 (2010): 2234–2242.
- 55. Tenny S. and Hoffman M. R., “Relative Risk,” in StatPearls (StatPearls Publishing, 2025), http://www.ncbi.nlm.nih.gov/books/NBK430824/.
- 56. Khaled Fahim N. and Negida A., “Sample Size Calculation Guide ‐ Part 2: How to Calculate the Sample Size for an Independent Cohort Study,” Advanced Journal of Emergency Medicine 3 (2019): e12.
- 57. Zhong B., “How to Calculate Sample Size in Randomized Controlled Trial?,” Journal of Thoracic Disease 1 (2009): 51–54.
- 58. Zabor E. C., Kaizer A. M., and Hobbs B. P., “Randomized Controlled Trials,” Chest 158 (2020): S79–S87.
- 59. Negida A., Fahim N. K., Negida Y., and Ahmed H., “Sample Size Calculation Guide ‐ Part 5: How to Calculate the Sample Size for a Superiority Clinical Trial,” Advanced Journal of Emergency Medicine 3 (2019): e49.
- 60. Negida A., “Sample Size Calculation Guide ‐ Part 6: How to Calculate the Sample Size for a Non‐Inferiority or an Equivalence Clinical Trial,” Advanced Journal of Emergency Medicine 4 (2020): e15.
- 61. Negida A., “Sample Size Calculation Guide ‐ Part 7: How to Calculate the Sample Size Based on a Correlation,” Advanced Journal of Emergency Medicine 4 (2020): e34.
- 62. Bujang M. A. and Baharum N., “Sample Size Guideline for Correlation Analysis,” World Journal of Social Science Research 3 (2016): 37.
- 63. Bujang M. A., “An Elaboration on Sample Size Determination for Correlations Based on Effect Sizes and Confidence Interval Width: A Guide for Researchers,” Restorative Dentistry and Endodontics 49 (2024): e21.
- 64. Faul F., Erdfelder E., Lang A.‐G., and Buchner A., “G*Power 3: A Flexible Statistical Power Analysis Program for the Social, Behavioral, and Biomedical Sciences,” Behavior Research Methods 39 (2007): 175–191.
- 65. Kang H., “Sample Size Determination and Power Analysis Using the G*Power Software,” Journal of Educational Evaluation for Health Professions 18 (2021): 17.
- 66. Freemantle N., “CD: StatsDirect—Statistical Software for Medical Research in the 21st Century,” British Medical Journal 321 (2000): 1536.
- 67. Sullivan K. M., Dean A., and Soe M. M., “OpenEpi: A Web‐Based Epidemiologic and Statistical Calculator for Public Health,” Public Health Reports 124 (2009): 471–474.