Journal of Postgraduate Medicine. 2024 Dec 11;70(4):227–231. doi: 10.4103/jpgm.jpgm_607_24

1. Study designs for making most of the limited resources

A Indrayan
PMCID: PMC11722713  PMID: 39660579

ABSTRACT

Medical research is costly and requires significant effort. While intracellular research hardly follows a formal design, most data-based studies require a structural approach for optimal utilization of resources. Depending on the study’s objectives and available resources, as well as the aim to obtain valid and reliable results, the design may be descriptive, employing specific sampling strategies, or analytical, exploring antecedent-outcome relationships. Analytical studies often involve designs such as randomization and blinding in clinical trials, and may use prospective, retrospective, or cross-sectional designs in observational studies. This first article in the series on biostatistics methods specifies the essential features of each design and describes the contexts in which they are most appropriate.

KEY WORDS: Analytical studies, clinical trials, cross-sectional studies, descriptive studies, laboratory experiments, observational studies, prospective, retrospective

Introduction

Nobel prizes epitomize groundbreaking research conducted worldwide. The 2023 Nobel prize in medicine was awarded to Kariko and Weissman for their discovery of how mRNA interacts with the immune system, leading to the development of CoViD-19 vaccines.[1] In 2022, Paabo received the Nobel prize for his work on genomes that advanced our understanding of human evolution.[2] These scientists worked intensively in their laboratories and thoroughly examined materials without adhering to a specific design, resulting in significant breakthroughs. However, this approach is primarily applicable to intracellular research. For most of us, medical research involves collecting data from groups of individuals and analyzing that data to extract meaningful insights amidst various variations, including intra-individual, inter-individual, environmental, instrumental, and others. It is for this type of empirical research that a well-defined study design is essential, ensuring that reliable and valid results are obtained while minimizing resource expenditure.

Study design refers to the framework used for collecting data, with the purpose of achieving the predefined primary and secondary objectives. It is a strategic approach intended to achieve robust results while optimizing resource utilization, adhering to ethical standards, and ensuring the pursuit of valid and reliable outcomes. These resources include the type and number of available individuals, access to expertise, instruments, time, and other necessary inputs.

The design tries to control various sources of aleatory and epistemic uncertainties[3] while accounting for antecedents, mediators, confounders, and outcomes. This approach allows for a clear message to be discerned amidst variations and limitations in knowledge. The design must withstand internal and external threats that may arise during the research process.[4]

The design of a study depends on its objectives and is crafted to provide accurate answers to the research questions posed. Medical studies generally have two primary types of objectives. The first is to describe a condition, such as the extent and type of specific health problems prevailing in a group or subgroups of individuals. The second type of objective involves the investigation of antecedent-outcome or cause-effect relationships; studies with this objective are classified as analytical studies. Analytical studies can be experimental, exploring the effects of human intervention, such as clinical trials, or observational, where naturally occurring events are studied without human intervention. The topic of study designs is extensive, and comprehensive resources are available, such as the book by Hulley et al.[5] This article can only provide a brief overview of the major designs, with a summary presented in Table 1.

Table 1. Features of major types of study designs

| Design | Features | Examples | Main statistical indices |
|---|---|---|---|
| 1. Descriptive | Magnitude of the medical condition in various types of cases | Profiles, case studies, case series, surveys | Percentages, prevalence, their confidence intervals, and regression for trend |
| 2. Analytical | Antecedent–outcome relationship | As stated next | |
| 2.1 Experiments/Trials | Effect of an intentional intervention on a defined outcome | Laboratory experiments on biological material and animals, clinical trials on humans for efficacy and side effects of a regimen | Difference between groups, relative risk, various durations |
| 2.2 Observational studies | Effect of natural exposure on the outcome | Effect of childhood obesity on hypertension in adulthood | As stated next |
| 2.2.1 Prospective | Investigation of outcome for a given exposure | Incidence of hypertension in adulthood in subjects with and without obesity in childhood | Incidence, relative risk, survival, hazard ratio, positive and negative predictivity |
| 2.2.2 Retrospective | Investigation of exposure in cases with and without a particular outcome | Childhood obesity in adults with and without hypertension | Prevalence, association, odds ratio, sensitivity-specificity, ROC curve |
| 2.2.3 Cross-sectional | Investigation of exposure and outcome in a sample of subjects | Investigate 1000 adults for hypertension and their childhood obesity | Prevalence, odds ratio, association, correlation, agreement |

Descriptive Studies

Sik et al.[6] conducted a multi-centric study through an online survey to evaluate the current status of point-of-care ultrasound (POCUS) in pediatric emergency departments (PEDs) and pediatric intensive care units (PICUs) in Turkey. They reported that approximately 90% of PEDs and PICUs used POCUS for areas such as thoracic and cardiovascular assessment. This study is descriptive in nature, with the primary objective of estimating the percentage of cases using POCUS. Another example of a descriptive study is the epidemiological mapping of cutaneous leishmaniasis in Saudi Arabia by Alharbi and Ahmed[7], which estimated age-gender specific prevalence rates in Saudi and non-Saudi residents. They mistakenly referred to their findings as incidence, not realizing that incidence refers specifically to the occurrence of new cases over a defined period. Other examples of descriptive studies include profiles of cases, case reports, case series, and surveys. Complete enumeration, such as a census, is also considered a descriptive study. This does not necessarily refer to a population census; it can involve a census of cases admitted to a hospital for a specific condition within a specified duration, as long as the objective is only to study the profile of those cases.

Statistically, descriptive studies are mostly used to estimate parameters such as prevalence, proportion, percentage, or other related metrics. Such studies can also be used to assess whether the extent of a disease reaches a predefined threshold within a population. Descriptive studies are useful for identifying which levels of a quantitative measurement are common and which are rare (statistically referred to as the distribution), as well as for determining which groups exhibit a higher prevalence of a disease or health condition compared to others. Studies on growth and development of children are popular examples of descriptive studies. Studies that delineate normal levels of various health indicators are also classified as descriptive. When a descriptive study is conducted repeatedly over time, trends in rates can be analyzed; for instance, Flavio-Reis et al.[8] investigated maternal deaths caused by eclampsia in Brazil from 2000 to 2021.

As the primary objective of descriptive studies is to estimate a parameter of interest, sample size becomes important for obtaining a reliable estimate. Precision, which assesses reliability, is statistically measured by the inverse of the width of the confidence interval (CI); a larger sample size results in a narrower CI and a more precise estimate. Statistical power is typically not a consideration in most descriptive studies. However, the validity of the estimate, evaluated by its closeness to the actual value, depends on the representativeness of the sample relative to the target population, rather than solely on sample size. Thus, the method of sampling is important. Both reliability and validity bring us back to the design choices that optimize the use of limited resources. In this context, design refers to both the sample size and the method of sampling. In a subsequent article in this series, we will present further details on these two statistical aspects of descriptive studies.
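The link between sample size and precision can be illustrated with a small sketch. It assumes the simple Wald (normal approximation) interval for a proportion, which is only one of several CI methods and is not prescribed by this article; the function name `wald_ci` and the sample sizes are hypothetical.

```python
import math

def wald_ci(cases, n, z=1.96):
    """Approximate 95% CI for a proportion using the Wald (normal) method."""
    p = cases / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clip to the valid range of a proportion.
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical samples with the same observed prevalence (20%) but different sizes.
for n in (100, 400, 1600):
    low, high = wald_ci(cases=n // 5, n=n)
    print(f"n={n:5d}  95% CI=({low:.3f}, {high:.3f})  width={high - low:.4f}")
```

Under this approximation, quadrupling the sample size halves the width of the interval, which is why gains in precision become progressively more expensive.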

Analytical Studies

The other major category of designs is analytical studies, which investigate the antecedent-outcome relationship, specifically how an outcome is affected or influenced by a set of antecedent characteristics. For example, such studies might explore what factors contribute to the occurrence of gastric ulcers in some individuals while preventing the disease in others. Although most of these studies report the presence or absence of associations between one or more factors and the outcome, they often imply a cause-effect relationship. Statistically, the primary aim of these studies is to evaluate the significance of the effect in terms of relative risk, odds ratio, or differences between two or more groups in mean, median, or proportion. These studies can estimate both the gross and net contribution of an antecedent to an outcome and assess which factors are more important and to what extent; however, this crucial aspect is often overlooked. Analytical studies can be classified into experimental studies (including trials) or observational studies.

Experiments and trials

By definition, an experiment involves studying the effect of a specific human intervention on a defined outcome. This can occur in a laboratory setting, such as examining the effect of exposure to mobile phones on learned responses in rats.[9] Experiments require a comparator group, known as the control group, which either does not receive any intervention or receives a different intervention, generally the existing standard treatment. Preclinical trials of various treatment regimens are mostly conducted on animals in laboratory environments, although they can also involve biological materials, such as biopsies, swabs, and blood samples. Laboratory conditions can be standardized, and animals of the same species and strain can be selected to control for variations, allowing researchers to infer a cause-effect relationship.

Experiments involving humans are called clinical trials. These trials present significant challenges due to ethical considerations and the considerable inter- and intra-individual variations, as well as environmental conditions that are rarely standardized and can significantly affect the outcomes. To minimize the effects of these variations, self-controls are sometimes employed, where the post-intervention status of individuals is compared with their pre-intervention status. However, such before-and-after studies mostly fail to correctly assess the net effect because the placebo and self-improvement effects can be confounded. Nonetheless, there have been successful experiments, such as Marshall’s self-experimentation demonstrating that peptic ulcers can result from H. pylori infection.[10] A variation of this is the n-of-1 trial, in which the same participant is repeatedly given different regimens for diseases that recur when the treatment is stopped for a while. In general, however, parallel controls are required, and random allocation is advocated to ‘ensure’ that the two groups are equivalent at baseline. For small samples, it is advisable to match controls with cases. Random allocation is typically achieved using a computer-generated random sequence. Wherever feasible, blinding is used to minimize bias. In single blinding, only the participants are blinded, i.e., they are not informed whether they belong to the intervention group or the control group. In double blinding, neither the participant nor the assessor knows to which group a patient belongs; in triple blinding, even the data analyst remains blind. The double-blind randomized controlled trial (RCT) is considered the gold standard design for evaluating the effect of an intervention on an outcome and is often the most cost-effective strategy. Finding no effect or a small effect in a trial does not diminish its value, as long as the intervention is deemed worthy of investigation.

For effective blinding, it is imperative that the test regimen appears identical to the control regimen, with similar packaging, color, route of administration, taste, and dose. This process is called masking. Another good strategy for controlling bias is concealment, where the person allocating the regimens is unaware of which regimen the next person will receive.
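A computer-generated random sequence of the kind mentioned above might be produced as in the following sketch. It uses permuted blocks to keep the two groups balanced, which is an assumption made here for illustration; the article does not specify any particular randomization scheme. The function name, block size, and seed are hypothetical.

```python
import random

def permuted_block_allocation(n_participants, block_size=4, seed=2024):
    """Return 'intervention'/'control' labels, balanced within each block."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = random.Random(seed)  # fixed seed makes the sequence reproducible
    sequence = []
    while len(sequence) < n_participants:
        half = block_size // 2
        block = ["intervention"] * half + ["control"] * half
        rng.shuffle(block)  # random order within the block
        sequence.extend(block)
    return sequence[:n_participants]

allocation = permuted_block_allocation(12)
print(allocation)
print("intervention:", allocation.count("intervention"),
      "control:", allocation.count("control"))
```

For concealment, such a list would be prepared and held by someone not involved in enrolling participants, so that the next assignment cannot be anticipated.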

Statistically, a clinical trial involves at least one factor, namely the group (case and control). However, it can also be multifactorial, allowing for the separate study of individuals across different age groups, sexes, or with various comorbidities to assess the impact of these variables on the outcome. Additionally, crossover trials can be implemented, where some randomly selected individuals receive regimen A followed by regimen B, while others receive regimen B followed by regimen A. This design is particularly suitable for conditions such as hypertension and diabetes, which generally recur when treatment is paused. An appropriate washout period is allowed between treatments in crossover trials. Up-and-down trials may also be employed, where a drug, such as an analgesic agent, is administered at a particular dose. If that dose does not relieve the pain in the first case, the dose is increased for the next individual; if the dose is effective, it is decreased. This method helps estimate an optimal median dose. There are other variations of trials as well, including pragmatic trials, multi-stage trials, adaptive trials, and superiority and noninferiority trials. For details, please refer to Indrayan and Malhotra.[11]
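The up-and-down rule described above can be sketched as follows. The starting dose, step size, floor of one step, and patient responses are all hypothetical, and the sketch only generates the sequence of doses; how the median effective dose is then estimated from that sequence is not covered here.

```python
def up_and_down_doses(start_dose, step, responses):
    """Return the dose given to each successive patient.

    responses[i] is True if patient i's pain was relieved at the dose given.
    """
    doses = [start_dose]
    for relieved in responses[:-1]:
        # Effective dose: step down for the next patient; ineffective: step up.
        next_dose = doses[-1] - step if relieved else doses[-1] + step
        doses.append(max(step, next_dose))  # assumption: never go below one step
    return doses

# Hypothetical responses of eight consecutive patients.
responses = [False, False, True, False, True, True, False, True]
doses = up_and_down_doses(start_dose=10, step=5, responses=responses)
print(doses)
```

With these invented responses the doses oscillate between 10 and 20 units, which is the behavior the design relies on: the sequence tends to hover around the dose at which roughly half the patients respond.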

Observational studies

Some interventions can be harmful; for instance, a person cannot be asked to smoke for 10 years to observe its effect on health. Additionally, certain interventions are unfeasible, such as altering a person’s blood group. Trials can also be expensive due to the challenges of locating suitable cases, finding equivalent controls, obtaining informed consent, administering the intervention, and conducting follow-up over a certain period to observe outcomes. Fortunately, there are some natural interventions, generally called exposures, whose effects can be explored. For example, some individuals smoke regardless, and some women naturally have low Hb levels. Observing these individuals retrospectively for antecedents or prospectively for outcomes and assessing the impact of the exposure on the outcome may be a more cost-effective strategy in such cases. Ideally, individuals in the groups under study should be similar, differing only in the factors being investigated. If this similarity cannot be ensured, the statistical method of propensity score matching can be employed to filter and identify matching individuals. Observational studies can be conducted in three major formats [Figure 1].

Figure 1. Types of observational studies

Prospective studies

A study is called prospective when both exposed and nonexposed individuals are followed over time to observe the occurrence of an outcome of interest. An example is the evaluation of operative duration as a predictor of mortality in pediatric emergency surgery[12], where patients with varying durations of surgery were followed up to assess survival or death. Importantly, there is no intentional human intervention in this type of study. Note that merely recruiting cases prospectively does not qualify a study as prospective, though this misconception is quite common. A follow-up period is necessary for a study to be classified as prospective, though the follow-up itself can be based on past data. For example, records of hernia cases operated on using two methods three years ago can be examined to determine the outcome if follow-up information is available in the records or can be obtained now. In this case, the operative method is the antecedent, and patient satisfaction may serve as the outcome. A related study type is a cohort study, where a fixed group of individuals is followed up. When follow-up occurs at different points in time for each participant, it is known as a longitudinal study; when the time points of follow-up are the same for all subjects, it is known as a repeated measures study.

Prospective studies follow a natural sequence from antecedent to outcome and can yield reliable estimates of incidence or risk, relative risk, and duration-based metrics (such as survival or hospital stay), assuming all other factors are controlled. Thus, the exposed and unexposed groups must be similar in all other respects. If they are not, the two groups may be affected by other factors, which should be accounted for during data analysis. For instance, in the previously mentioned example, patient survival could be affected by their pre-operative nutritional status and the level of love and care received from their attendants. Prospective studies are useful for evaluating the predictive performance of a test or model. However, they are less appropriate for rare outcomes, as only a few participants may develop the condition. While follow-up in prospective studies can be resource-intensive, the data collected are typically more accurate as they are collected in real time.
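As a rough illustration of the incidence and relative risk that a prospective study yields, the sketch below uses an invented cohort that loosely echoes the childhood obesity example in Table 1. The counts are hypothetical and are not taken from any study cited here.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Return (risk in exposed, risk in unexposed, relative risk)."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return risk_exposed, risk_unexposed, risk_exposed / risk_unexposed

# Hypothetical cohort: 200 obese and 800 non-obese children followed into
# adulthood, of whom 60 and 80 respectively develop hypertension.
r1, r0, rr = relative_risk(60, 200, 80, 800)
print(f"incidence in exposed   = {r1:.2f}")
print(f"incidence in unexposed = {r0:.2f}")
print(f"relative risk          = {rr:.2f}")
```

Because participants are selected on exposure and followed forward, both incidences are directly estimable, which is what makes the ratio a genuine relative risk.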

Retrospective studies

Technically, a study based on records of past cases is not necessarily retrospective, although this term is often used in some contexts. A true retrospective study involves selecting participants with and without the disease of interest and examining their antecedents, making it essentially a case-control study. In this study type, participants are chosen based on known outcomes. For example, individuals with malignant and benign tumors can be questioned about their medical history to identify factors potentially linked to malignancy. Desai et al.[13] used this approach, selecting 100 cases of leptospirosis and 300 controls from the same neighborhood to investigate disease-associated factors; this qualifies as a retrospective study. While studies of past records for current cases are retrospective, as they move from cases to antecedents, not all studies based on past records are retrospective. For instance, the study on hernia cases mentioned in the preceding section is prospective in nature, despite using records of past cases. Retrospective studies are relatively easy and cost-effective to conduct, and they can start with a sufficient number of cases to ensure adequate statistical power. However, data on past exposure may be biased, as some participants might suppress the truth or fail to recall details accurately. This format can obscure confounders, potentially compromising the validity of the results. In retrospective studies, the relative importance of different antecedents is measured by the odds ratio, rather than the relative risk. Additionally, indices such as sensitivity, specificity of a test, and the ROC curve can be obtained.
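The indices named above can be computed from simple 2×2 counts, as in this minimal sketch. All counts are invented for illustration and do not come from any of the cited studies; the function names are hypothetical.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity among those with the disease, specificity among those without."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Hypothetical case-control data: 200 cases (80 exposed) and 200 controls (40 exposed).
print(f"odds ratio = {odds_ratio(80, 120, 40, 160):.2f}")

# Hypothetical test results in 100 diseased and 200 non-diseased individuals.
sens, spec = sensitivity_specificity(true_pos=90, false_neg=10, true_neg=160, false_pos=40)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Since the investigator fixes the numbers of cases and controls, the proportion with the outcome is not an estimate of risk, which is why the odds ratio rather than the relative risk is reported.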

Sometimes, in prospective studies, cases with and without the outcome of interest are re-examined to investigate antecedent characteristics that were not initially assessed. This type of study is known as a nested case-control study.

Cross-sectional studies

In some situations, a cross-sectional study is more convenient. This term should be used exclusively for studies that investigate antecedent-outcome relationships; applying it to descriptive studies is a common error. In these studies, antecedents and outcomes are simultaneously assessed. For example, to study the effect of obesity (antecedent) on diabetes control (outcome) in individuals aged 50 years or older, we can take a sample of 1000 individuals within this age group who have known diabetes, and assess both their obesity status and diabetes control status simultaneously. This approach may be relatively easier than selecting obese individuals with diabetes and monitoring their diabetes control status over time (as in a prospective study) or selecting individuals based on their diabetes control status and reviewing their obesity status retrospectively. However, in a cross-sectional study the sample must be unbiased, preferably random, to ensure that the antecedents and outcomes are represented in the same proportions as in the target population. Tripathi et al.[14] reported a cross-sectional study on the effect of awareness about antiglaucoma drug treatment on medication adherence. Data analysis in cross-sectional studies can involve calculating the odds ratio or using correlation or agreement methods, such as percentage agreement within the clinical tolerance for quantitative data and Cohen’s kappa for qualitative data. The details of these methods will be discussed in a subsequent article in this series.
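Pending that fuller discussion, Cohen's kappa for qualitative data can be sketched as below. The agreement table is hypothetical, and the function is a plain implementation of the standard unweighted kappa (observed agreement corrected for agreement expected by chance).

```python
def cohens_kappa(table):
    """table[i][j] = number of subjects placed in category i by method A and j by method B."""
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical 2x2 table: two methods classify 100 subjects as positive/negative.
table = [[40, 10],
         [5, 45]]
print(f"kappa = {cohens_kappa(table):.2f}")
```

In this invented table the raw agreement is 85%, but half of that is expected by chance alone given the marginal totals, so kappa is noticeably lower than the percentage agreement.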

To summarize, use a descriptive study to profile the characteristics of participants. To determine the effect of an intervention, use trials when feasible. If a trial is not feasible within the available resources, opt for an observational study. A prospective format is preferable when possible; however, if follow-up is challenging, a retrospective format may be used instead. Cross-sectional studies can be carried out when an unbiased random sample can be selected from the target population. For easy identification of the design, please refer to the list in Table 1.

This article describes the designs used in most medical studies; however, some studies do not strictly conform to these traditional formats. Examples include meta-analyses and methodological studies. For further details, please refer to the book by Indrayan and Malhotra.[11]

Next: Types of data and data collation

Conflicts of interest

There are no conflicts of interest.

Funding Statement

Nil

References

1. Karikó K, Buckstein M, Ni H, Weissman D. Suppression of RNA recognition by Toll-like receptors: The impact of nucleoside modification and the evolutionary origin of RNA. Immunity. 2005;23:165–75. doi: 10.1016/j.immuni.2005.06.008.
2. Wielgus K, Danielewski M, Walkowiak J. Svante Pääbo, reader of the neanderthal genome. Acta Physiol (Oxf) 2023;237:e13902. doi: 10.1111/apha.13902.
3. Indrayan A. Aleatory and epistemic uncertainties can completely derail medical research results. J Postgrad Med. 2020;66:94–8. doi: 10.4103/jpgm.JPGM_585_19.
4. Rivara FP. What I have learned in the last 24 years being Editor-in-Chief. JAMA Pediatr. 2024. doi: 10.1001/jamapediatrics.2024.3288.
5. Hulley SB, Cummings SR, Browner WS, Grady DG, Newman TB. Designing Clinical Research. 4th ed. Lippincott Williams and Wilkins; 2007.
6. Sık N, Arslan G, Akca Çağlar A, Ülgen Tekerek N, Fidancı İ, Tolu Kendir Ö, et al. The use of point-of-care ultrasound in pediatric emergency departments and intensive care units: A descriptive study from Turkey. Pediatr Emerg Care. 2024;40:796–800. doi: 10.1097/PEC.0000000000003252.
7. Alharbi B, Ahmed M. Epidemiological mapping of cutaneous leishmaniasis in Saudi Arabia: An observational descriptive study. J Epidemiol Glob Health. 2024;14:1281–8. doi: 10.1007/s44197-024-00285-7.
8. Flávio-Reis VHP, Pessoa-Gonçalves YM, Barbosa AC, Desidério CS, Rodrigues WF, Oliveira CJF. Maternal deaths caused by eclampsia in Brazil: A descriptive study from 2000 to 2021. Rev Bras Ginecol Obstet. 2024;46:e-rbgo65. doi: 10.61622/rbgo/2024rbgo65.
9. Narayanan SN, Kumar RS, Potu BK, Nayak S, Mailankot M. Spatial memory performance of Wistar rats exposed to mobile phone. Clinics (Sao Paulo) 2009;64:231–4. doi: 10.1590/S1807-59322009000300014.
10. Watts G. Nobel prize is awarded to doctors who discovered H pylori. BMJ. 2005;331:795. doi: 10.1136/bmj.331.7520.795.
11. Indrayan A, Malhotra RK. Medical Biostatistics. 4th ed. Taylor and Francis Group, CRC Press; USA: 2018.
12. Kaushal-Deep SM, Ahmad R, Lodhi M, Chana RS. A prospective study of evaluation of operative duration as a predictor of mortality in pediatric emergency surgery: Concept of 100 minutes laparotomy in resource-limited setting. J Postgrad Med. 2019;65:24–32. doi: 10.4103/jpgm.JPGM_52_18.
13. Desai KT, Patel F, Patel PB, Nayak S, Patel NB, Bansal RK. A case-control study of epidemiological factors associated with leptospirosis in South Gujarat region. J Postgrad Med. 2016;62:223–27. doi: 10.4103/0022-3859.188551.
14. Tripathi RK, Shah A, Jalgaonkar SV, Kerkar S. Evaluation of antiglaucoma drug treatment awareness and patient-reported medication adherence: Determinants of glaucoma management. J Postgrad Med. 2023;69:146–52. doi: 10.4103/jpgm.jpgm_905_22.

Articles from Journal of Postgraduate Medicine are provided here courtesy of Wolters Kluwer -- Medknow Publications
