PLOS One. 2023 Dec 15;18(12):e0295915. doi: 10.1371/journal.pone.0295915

Untangling the effects of multiple exposures with a common reference group in an epidemiologic study: A practical revisit

Robert E Fontaine 1, Yulei He 2, Bao-Ping Zhu 3,*
Editor: Awatif Abid Al-Judaibi 4
PMCID: PMC10723729  PMID: 38100505

Abstract

When assessing multiple exposures in epidemiologic studies, epidemiologists often use multivariable regression models with main effects only to control for confounding. This method can mask the true effects of individual exposures, potentially leading to wrong conclusions. We revisited a simple, practical, and often overlooked approach to untangle effects of the exposures of interest, in which the combinations of all levels of the exposures of interest are recoded into a single, multicategory variable. One category, usually the absence of all exposures of interest, is selected as the common reference group (CRG). All other categories representing individual and joint exposures are then compared to the CRG using indicator variables in a regression model or in a 2×2 contingency table analysis. Using real data examples, we showed that using the CRG analysis results in estimates of individual and joint effects that are mutually comparable and free of each other’s confounding effects, yielding a clear, accurate, intuitive, and simple summarization of epidemiologic study findings involving multiple exposures of interest.

Introduction

Epidemiologists frequently use case-control, retrospective cohort, and cross-sectional studies to assess the effects of multiple exposures on a disease in a single study. When analyzing data from such studies, researchers frequently start by comparing each group exposed to an exposure of interest with the remainder without that same exposure. However, such analyses lead to the “shifting reference group” problem [1], where the effect measures (e.g., risk or odds ratios) for the exposures are not mutually comparable because the denominators differ. The effect estimates also trend toward the null value because each exposure is compared to a mixture of other exposures. To address problems with multiple exposures, researchers often use a multivariable regression model to control for confounding. However, this approach does not resolve the problem with shifting reference group, contrary to common belief.

Conversely, using a common reference group (CRG) for each individual and joint exposure resolves the shifting reference group problem. The CRG may be the group lacking all exposures of interest, the one with the lowest risk, or one selected by the analyst to be the appropriate reference standard [2]. Once the CRG has been chosen, all joint and individual effects are separated, and each is individually compared to the CRG.

Use of a CRG when comparing the individual effects of two or more exposures has been mentioned in several textbooks, without further elaboration [1–5]. Some texts limit the discussion to assessing interaction [6–8]. Similarly, while several journal articles since 2000 have provided coverage on the use of a CRG in the context of determining additive interaction [8–14], we could find only one that discussed estimating, comparing, or disentangling the individual effects in the context of gene-environment interaction [14]. A review of 138 studies published from 2001 to 2007 showed that 89% did not apply a CRG to estimate the individual and joint effects when multiple exposures were involved [13]. Continued underuse of the CRG may lead to missed opportunities to identify important exposures as well as misleading quantification and comparison of effect estimates [10, 13, 14].

In our practice, we frequently see epidemiologic analyses with the problem of the shifting reference group, potentially leading to misunderstanding and misinterpretation of results. Accordingly, in this paper we aim to provide a practical, straightforward illustration of the CRG approach. We provide real data examples, lay out practical approaches to data analysis, provide an intuitive illustration of the underlying methodological issues, and discuss advantages and caveats in using the CRG approach. We advocate for the use of the CRG as a standard approach in analyzing epidemiologic studies involving multiple exposures.

Materials and methods

Typical approach in epidemiologic data analysis

Suppose in a study there is a binary outcome Y, with values of 1 or 0 (e.g., Y = 1 for being ill and Y = 0 for being well). There are also three binary exposures, X1, X2, and X3, each taking on values 1 or 0 (exposed vs. nonexposed). Typically, the epidemiologic analysis is conducted as follows.

Step 1. Univariate analysis: The relationships, Y vs. X1, Y vs. X2, and Y vs. X3, are evaluated separately, using either three 2×2 tables or univariate logistic regression models.

$$\mathrm{Logit}(P(Y=1)) = \beta_{o1} + \beta_1 X_1,$$
$$\mathrm{Logit}(P(Y=1)) = \beta_{o2} + \beta_2 X_2,$$
$$\mathrm{Logit}(P(Y=1)) = \beta_{o3} + \beta_3 X_3,$$

where Logit(P(Y = 1)) is the logit link function for modeling the probability of Y = 1 (e.g., being ill); βo is the intercept of the regression line; βi (for i = 1,2,3) is the regression coefficient corresponding to the ith exposure factor, Xi.

In a 2×2 table or univariate regression model, the group with exposure to X1, (X1 = 1), is compared to a reference group consisting of those with X1 = 0 and a mixture of those with X2 = 0, X2 = 1, X3 = 0, and X3 = 1. Similarly, the group with exposure to X2 (X2 = 1), is compared to a reference group consisting of those with X2 = 0 and a mixture of those with X1 = 0, X1 = 1, X3 = 0, and X3 = 1; and so on. Hence each effect estimate has a different (“shifting”) reference group. Since each reference group contains the other exposures, the effect measures will be biased toward the null. This is akin to an experiment with multiple experimental arms, in which the researcher compares an experimental arm to the combination of other experimental arms plus the control arm, instead of comparing each individual experimental arm to the control arm.
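The pull toward the null can be seen with a small numeric sketch. The cohort below is entirely hypothetical (two exposures rather than three, risk ratios rather than odds ratios, and invented counts); the function names are illustrative only. It compares the crude risk ratio for X1, whose reference group mixes unexposed people with those exposed to X2 only, against the contrast of X1-only versus the group with neither exposure.

```python
# Hypothetical cohort (all numbers invented for illustration): two binary
# exposures, X1 and X2, cross-classified into four groups.
# Each entry maps (x1, x2) -> (number ill, group size).
cohort = {
    (0, 0): (10, 100),  # neither exposure
    (1, 0): (40, 100),  # X1 only
    (0, 1): (40, 100),  # X2 only
    (1, 1): (40, 100),  # both exposures
}


def risk(groups):
    """Pooled risk across a list of (ill, total) groups."""
    ill = sum(g[0] for g in groups)
    total = sum(g[1] for g in groups)
    return ill / total


def crude_rr_x1(data):
    """X1 = 1 versus X1 = 0, ignoring X2 (the shifting reference group)."""
    exposed = [v for (x1, _), v in data.items() if x1 == 1]
    unexposed = [v for (x1, _), v in data.items() if x1 == 0]
    return risk(exposed) / risk(unexposed)


def rr_x1_only_vs_none(data):
    """X1-only group versus the group with neither exposure."""
    return risk([data[(1, 0)]]) / risk([data[(0, 0)]])


if __name__ == "__main__":
    print("Crude RR for X1:", round(crude_rr_x1(cohort), 2))
    print("RR, X1 only vs. neither:", round(rr_x1_only_vs_none(cohort), 2))
```

In this invented cohort the crude comparison is diluted because the X1 = 0 group contains people who are ill from X2.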

Step 2. Multivariate analysis with main effects: Suppose X1, X2 and X3 from step 1 are all associated with the outcome, Y. The main-effect multivariable regression model is often used to control for confounding of those risk factors simultaneously:

$$\mathrm{Logit}(P(Y=1)) = \beta_o + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3$$

Epidemiologists commonly assume that the multivariable regression model takes care of the shifting reference group problem, when in fact, it does not. For example, from the main-effect logistic regression model, the adjusted odds ratio (ORadj) associated with X1 = 1 vs. X1 = 0, ORadj (X1 = 1 vs. X1 = 0), controlling for X2 and X3, is calculated as follows:

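$$\mathrm{OR}_{adj}(X_1 = 1 \text{ vs. } X_1 = 0) = \frac{\exp(\beta_o + \beta_1 \cdot 1 + \beta_2 X_2 + \beta_3 X_3)}{\exp(\beta_o + \beta_1 \cdot 0 + \beta_2 X_2 + \beta_3 X_3)} = \exp(\beta_1).$$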

Similarly, ORadj (X2 = 1 vs. X2 = 0) and ORadj (X3 = 1 vs. X3 = 0), controlling for other exposures, are calculated as follows:

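$$\mathrm{OR}_{adj}(X_2 = 1 \text{ vs. } X_2 = 0) = \frac{\exp(\beta_o + \beta_2 \cdot 1 + \beta_1 X_1 + \beta_3 X_3)}{\exp(\beta_o + \beta_2 \cdot 0 + \beta_1 X_1 + \beta_3 X_3)} = \exp(\beta_2),$$
$$\mathrm{OR}_{adj}(X_3 = 1 \text{ vs. } X_3 = 0) = \frac{\exp(\beta_o + \beta_3 \cdot 1 + \beta_1 X_1 + \beta_2 X_2)}{\exp(\beta_o + \beta_3 \cdot 0 + \beta_1 X_1 + \beta_2 X_2)} = \exp(\beta_3).$$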

Notice that the denominators, i.e., $\exp(\beta_o + \beta_1 \cdot 0 + \beta_2 X_2 + \beta_3 X_3)$, $\exp(\beta_o + \beta_2 \cdot 0 + \beta_1 X_1 + \beta_3 X_3)$, and $\exp(\beta_o + \beta_3 \cdot 0 + \beta_1 X_1 + \beta_2 X_2)$, differ from each other in terms of exposures; hence the problem of shifting reference group persists. The same can be extended to other generalized linear models (GLMs) for different types of outcome variables where there are at least two exposures. This point will be further illustrated in the numeric examples section.

Several consequences result from the shifting reference group problem. First, the effect measures are no longer comparable to one another because each has a different reference level. Second, the shifting reference group frequently biases association measures toward the null [1], hence reducing the probability of identifying a statistically significant exposure and lessening the magnitude of the effect estimate. Third, patterns of interest in the joint effect of two or more variables may be missed.

The common reference group (CRG) approach

The CRG approach should be considered when two or more exposures, X1, X2, X3… may be relevant in an investigation. This may arise from the hypothesis, the study question, or a priori knowledge of the exposure-disease relationship. During an epidemiologic study, the interplay of multiple exposures may first be noticed during the initial descriptive analysis. For example, foodborne disease outbreak investigations often begin with the observations that only people eating meals at the same time and place became ill, and that several foods have relatively weak associations with illness that have or approach statistical significance. A CRG approach would be needed to untangle the effects of these foods. Even when the effect measures are high, a CRG approach will still help sort them out by degree of importance.

The next step involves recoding the exposure variables, X1, X2, X3, into a single, new exposure variable, Z, the values of which are a combination of values of exposure variables, X1, X2, and X3 (Table 1). The recoding can be done using a spreadsheet, database, or statistical software.

Table 1. Illustrative example of creating a common reference group (CRG) by recoding exposures, X1, X2, and X3.

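| Group | X1 | X2 | X3 | Factors exposed to | Z |
|---|---|---|---|---|---|
| A | 1 | 1 | 1 | X1, X2, and X3 | 8 |
| B | 1 | 1 | 0 | X1, X2 but not X3 | 7 |
| C | 0 | 1 | 1 | X2, X3 but not X1 | 6 |
| D | 1 | 0 | 1 | X1, X3 but not X2 | 5 |
| E | 1 | 0 | 0 | X1, but not X2, X3 | 4 |
| F | 0 | 1 | 0 | X2, but not X1, X3 | 3 |
| G | 0 | 0 | 1 | X3, but not X1, X2 | 2 |
| H (CRG) | 0 | 0 | 0 | None of the three | 1 |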
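The recoding in Table 1 amounts to a lookup from the three exposure values to Z. A minimal sketch follows; the Z codes are taken from Table 1, while the record layout, field names, and the helper `recode` are hypothetical and chosen only for illustration.

```python
# Lookup taken from Table 1: (X1, X2, X3) -> Z, with Z = 1 as the CRG.
Z_CODES = {
    (1, 1, 1): 8,  # Group A
    (1, 1, 0): 7,  # Group B
    (0, 1, 1): 6,  # Group C
    (1, 0, 1): 5,  # Group D
    (1, 0, 0): 4,  # Group E
    (0, 1, 0): 3,  # Group F
    (0, 0, 1): 2,  # Group G
    (0, 0, 0): 1,  # Group H (CRG)
}


def recode(x1, x2, x3):
    """Return the single multicategory exposure variable Z."""
    return Z_CODES[(x1, x2, x3)]


# Hypothetical study records; the field names are assumptions.
records = [
    {"id": 1, "x1": 1, "x2": 0, "x3": 0, "ill": 1},
    {"id": 2, "x1": 0, "x2": 0, "x3": 0, "ill": 0},
    {"id": 3, "x1": 1, "x2": 1, "x3": 1, "ill": 1},
]

for rec in records:
    rec["z"] = recode(rec["x1"], rec["x2"], rec["x3"])

if __name__ == "__main__":
    print([(rec["id"], rec["z"]) for rec in records])
```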

In the analysis, variable Z is used instead of variables X1, X2, and X3. Using standard 2×2 table analyses, each group, A–G, is individually compared with the CRG, Group H (i.e., A vs. H, B vs. H, etc.). Alternatively, one may use logistic regression (or another GLM), in which variable Z is recoded into indicator variables and Group H serves as the CRG. For large samples, both methods yield mathematically identical results; for small samples, however, regression analyses may result in matrix tolerance problems due to zero or sparse cells.
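For the regression route, the indicator coding of Z can be sketched as follows. This is an illustration only: the function name and the column order are assumptions, and most statistical packages generate such indicators automatically once a reference level is declared.

```python
# Non-reference levels of Z from Table 1; Z = 1 (Group H) is the CRG and
# therefore gets no indicator column of its own.
NON_REFERENCE_LEVELS = (2, 3, 4, 5, 6, 7, 8)


def indicator_row(z, levels=NON_REFERENCE_LEVELS):
    """Return one 0/1 indicator per non-reference level of Z."""
    return [1 if z == level else 0 for level in levels]


if __name__ == "__main__":
    print(indicator_row(1))  # CRG: every indicator is 0
    print(indicator_row(4))  # Group E (X1 only): only the Z = 4 column is 1
```

Because the CRG is the only group with all indicators equal to zero, each regression coefficient contrasts one exposure group with that same reference.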

The CRG may be any logically appropriate group chosen by the epidemiologist. Usually, it is the group with absence of the component exposures [4], i.e., group H in Table 1. However, not all situations have an absence-of-exposure group. For example, when the suspected exposure is in the drinking water during an outbreak, the CRG cannot be a group that did not drink any water. Instead, the CRG can be the exposure group with the lowest risk in a univariate analysis [15, 16]. When assessing the effectiveness of multiple protective measures (e.g., facemask use, handwashing, air filtration, physical distancing) for preventing respiratory infection, the group with exposure to the infection and without any of these protective measures may serve as the CRG. Alternatively, one may compute risk scores using the number of protective measures and create a group with the lowest risk score as the CRG. Groups with higher risk scores can be compared to the CRG to assess the dose-response relationship. Examples of these uses of the CRG approach include an investigation of risk factors for influenza [17], and a study of the preventive effect of handwashing on hand, foot, and mouth disease and herpangina [18].

A mathematically equivalent approach to the CRG is to force all possible interaction terms into a saturated multiple regression model [19]. This approach has practical difficulties and drawbacks, which are discussed later in this paper. Finally, adjusting for potential confounding effects of other variables, such as age or socioeconomic status, may still be needed in the data analysis.
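For the simplest case of two binary exposures, the equivalence can be written out explicitly. This is a sketch for the two-exposure case only; the interaction coefficient $\beta_{12}$ and the subscripted notation $\mathrm{OR}_{jk}$ (for $X_1 = j$, $X_2 = k$ versus $X_1 = 0$, $X_2 = 0$) are introduced here for illustration. The saturated model is

$$\mathrm{Logit}(P(Y=1)) = \beta_o + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2,$$

and every contrast shares the same denominator, $\exp(\beta_o)$, which corresponds to the CRG:

$$\mathrm{OR}_{10} = \exp(\beta_1), \qquad \mathrm{OR}_{01} = \exp(\beta_2), \qquad \mathrm{OR}_{11} = \exp(\beta_1 + \beta_2 + \beta_{12}).$$

The joint effect therefore has to be assembled from three coefficients, whereas the recoded variable Z yields it directly as a single indicator coefficient.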

Results

Here we use two data examples to show how using a CRG can lead to richer and more illuminating study findings.

Ebola virus disease outbreak

Investigation of a large Ebola virus disease outbreak in Zaire in 1976 provides a simple example of two exposures of interest [20, 21]. From the initial descriptive epidemiologic analysis, the investigators hypothesized that there were two distinct modes of transmission: hospital (hospital worker, visitor, or patient), and community (person-to-person contact in communities). The investigators conducted a well powered case-control study (318 cases and 318 controls) to confirm this hypothesis.

Univariate analysis of the data (S1 File, first tab, available in the Instructor’s Guide of the TEPHINET case study [21]) showed that the disease was highly significantly associated with exposures to both the hospital (crude OR = 7.5, 95% CI: 4.8–12) and the community cases (crude OR = 15, 95% CI: 9.5–23). The adjusted ORs from a main-effects logistic regression model increased moderately for both hospital (adjusted OR = 29, 95% CI: 18–48) and community (adjusted OR = 19, 95% CI: 11–32) exposures (Table 2).

Table 2. Case-control study of risk factors for Ebola virus disease, Zaire, 1976: Crude analysis, main-effect logistic regression, and common reference group (CRG) analysis.

Univariate and Main-Effect Logistic Regression Analysis

| Exposure* | Cases (N = 318) | Controls (N = 318) | OR crude (95% CI)† | OR adj (95% CI)‡ |
|---|---|---|---|---|
| Hosp (+) | 128 | 26 | 7.5 (4.8–12) | 29 (18–48) |
| Hosp (-) | 190 | 292 | Ref | Ref |
| Comm (+) | 192 | 30 | 15 (9.5–23) | 19 (11–32) |
| Comm (-) | 126 | 288 | Ref | Ref |

CRG Analysis

| Hosp‖ | Comm¶ | Cases (N = 318) | Controls (N = 318) | OR CRG (95% CI)§ |
|---|---|---|---|---|
| (+) | (+) | 43 | 4 | 70 (24–205) |
| (+) | (-) | 85 | 22 | 25 (14–44) |
| (-) | (+) | 149 | 26 | 37 (22–63) |
| (-) | (-) | 41 | 266 | Ref |

OR = odds ratio, CI = confidence interval, NC = not calculated; Ref = reference level.

* Exposure (+): Exposed; (-): Not exposed.

† Crude odds ratio (OR) and 95% confidence interval (CI) from the univariate analysis.

‡ Adjusted OR and 95% CI from the main-effect logistic regression analysis.

§ OR and 95% CI from 2×2 table analysis using a common reference group.

‖ Hosp: hospital exposure, i.e., a hospital worker, visitor, or patient.

¶ Comm: community exposure, i.e., any person with face-to-face exposure to a suspected hemorrhagic fever case in the community.

However, using those with no hospital or community exposures as the CRG, the OR for hospital-only exposure was 25, that for community-only exposure was 37, whereas that for those with both hospital and community exposures was 70 (Table 2). These ORs illustrated both the individual and joint effects of hospital and community exposures. Of note, a saturated logistic regression model would give the same ORs as the 2×4 table (not shown).
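The CRG comparisons in Table 2 can be reproduced from the cell counts with a few lines of code. The sketch below uses the counts from the CRG panel of Table 2; the use of Woolf's log-based interval with z = 1.96 is an assumption, since the paper states only that the intervals come from 2×2 table analysis, and the function names are illustrative.

```python
import math

# Cell counts from the CRG panel of Table 2: (cases, controls).
EBOLA = {
    "hospital and community": (43, 4),
    "hospital only": (85, 22),
    "community only": (149, 26),
    "neither (CRG)": (41, 266),
}


def odds_ratio_ci(exposed, reference, z=1.96):
    """OR and Woolf (log-based) interval for one group versus the reference.

    The Woolf interval is an assumed choice; it requires nonzero cells.
    """
    a, b = exposed    # cases, controls in the exposure group
    c, d = reference  # cases, controls in the reference group
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


def crg_table(counts, crg_key):
    """Compare every non-reference group with the same reference group."""
    reference = counts[crg_key]
    return {
        group: odds_ratio_ci(cells, reference)
        for group, cells in counts.items()
        if group != crg_key
    }


if __name__ == "__main__":
    for group, (est, lo, hi) in crg_table(EBOLA, "neither (CRG)").items():
        print(f"{group}: OR = {est:.0f} (95% CI {lo:.0f}-{hi:.0f})")
```

Under these assumptions, the printed values should round to the CRG column of Table 2.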

Cholera in a refugee camp

Epidemiologists investigated an acute cholera outbreak in a refugee settlement in Uganda during 2018 to determine if drinking water was the source [15]. The refugees had three main sources of water: a stream running through the camp, a government-managed water storage tank, and a spring of groundwater. From both the univariate analysis and the main-effect logistic regression analysis of the case-control study data (S1 File, second tab), the ORs associated with drinking the stream water and tank water ranged from 2.2 to 2.5, and an inverse association was found with drinking the spring water (Table 3).

Table 3. Case-control study of risk factors for cholera, Uganda, 2018: The comparison of the crude analysis, main-effect logistic regression, and common reference group analysis.

Univariate and Main-Effects Logistic Regression Analysis

| Exp* | Cases (N = 73) | Controls (N = 107) | OR crude (95% CI)† | OR adj (95% CI)‡ |
|---|---|---|---|---|
| Str(+) | 48 | 46 | 2.5 (1.4–4.7) | 2.2 (1.2–4.1) |
| Str(-) | 25 | 61 | Ref | Ref |
| Tank(+) | 64 | 80 | 2.4 (1.1–5.5) | 2.5 (1.1–5.9) |
| Tank(-) | 9 | 27 | Ref | Ref |
| Spr(+) | 1 | 13 | 0.10 (0.013–0.79) | 0.13 (0.016–1.0) |
| Spr(-) | 72 | 94 | Ref | Ref |

Common Reference Group Analysis

| Spr* | Tank* | Str* | Cases (N = 73) | Controls (N = 107) | OR CRG (95% CI)§ |
|---|---|---|---|---|---|
| (+) | (+) | (+) | 1 | 0 | NC |
| (+) | (+) | (-) | 0 | 11 | 0 (0–60)‖ |
| (-) | (+) | (+) | 39 | 36 | 17 (2.2–137) |
| (+) | (-) | (+) | 0 | 1 | NC |
| (+) | (-) | (-) | 0 | 1 | NC |
| (-) | (-) | (+) | 8 | 9 | 14 (1.5–133) |
| (-) | (+) | (-) | 24 | 33 | 12 (1.4–94) |
| (-) | (-) | (-) | 1 | 16 | CRG¶ |

Exp = exposure, Str = stream, Spr = spring, OR = odds ratio, CI = confidence interval, CRG = common reference group, NC = not calculated; Ref = reference level.

* (+): Exposed; (-): Non-exposed.

† Crude odds ratio (OR) and 95% confidence interval (CI) from the univariate analysis.

‡ Adjusted OR and 95% CI from the main-effect logistic regression analysis.

§ OR and 95% CI from 2×2 table analysis using a common reference group.

‖ Fisher’s exact confidence interval.

¶ Common reference group: Those who drank boiled, bottled, or rainwater

Using the CRG approach with eight combinations of water sources, seven were compared individually against the CRG (the eighth) of only drinking boiled, bottled, or rainwater. The analysis revealed much stronger and significant associations of cholera with drinking water from both the tank and the stream (OR = 17), the stream only (OR = 14), and from the tank only (OR = 12), compared with the univariate and main-effect logistic regression analyses (Table 3). This finding prompted further investigation, which revealed that the tank water was not drawn from the municipal drinking water system as contracted; rather, it was taken from a nearby lake. A cholera outbreak was ongoing in the lakeshore villages, where open defecation was common. Although unproven with microbiological methods, investigators suspected that the contaminated lake water, trucked to the camp storage tank, introduced cholera into the camp. The unprotected stream likely was secondarily contaminated after the outbreak began. This finding more accurately informed public health interventions for controlling the outbreak.

Note that with a 2×8 table, three interaction cells were sparsely populated. The respective ORs for these cells were not calculated, but the cells were left in the table to maintain the segregation of the respective data from the other terms and assure untangling of the effect estimates for the individual sources. A saturated logistic regression model could not be fitted due to matrix tolerance being exceeded.
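The contrast between the crude and CRG analyses of the cholera data can be sketched directly from the eight cells of Table 3. The counts below are from the CRG panel of Table 3; the key layout and function names are assumptions. This sketch returns `None` whenever a 2×2 table contains a zero cell and does not attempt exact intervals, so it does not reproduce the OR of 0 with a Fisher's exact interval that Table 3 reports for the spring-and-tank group.

```python
# Cell counts from the CRG panel of Table 3.
# Key: (spring, tank, stream), 1 = drank from the source, 0 = did not.
# Value: (cases, controls).
CHOLERA = {
    (1, 1, 1): (1, 0),
    (1, 1, 0): (0, 11),
    (0, 1, 1): (39, 36),
    (1, 0, 1): (0, 1),
    (1, 0, 0): (0, 1),
    (0, 0, 1): (8, 9),
    (0, 1, 0): (24, 33),
    (0, 0, 0): (1, 16),  # CRG: none of the three sources
}
CRG = (0, 0, 0)


def odds_ratio(exposed, reference):
    """Point OR, or None when any of the four cells is zero."""
    a, b = exposed
    c, d = reference
    if 0 in (a, b, c, d):
        return None
    return (a * d) / (b * c)


def crude_or(counts, position):
    """Collapse over the other sources: everyone who used this source
    versus everyone who did not (the shifting reference group)."""
    exposed = [0, 0]
    unexposed = [0, 0]
    for key, (cases, controls) in counts.items():
        target = exposed if key[position] == 1 else unexposed
        target[0] += cases
        target[1] += controls
    return odds_ratio(tuple(exposed), tuple(unexposed))


def crg_or(counts, group, crg=CRG):
    """One exposure combination versus the common reference group."""
    return odds_ratio(counts[group], counts[crg])


if __name__ == "__main__":
    for name, pos in (("spring", 0), ("tank", 1), ("stream", 2)):
        print(f"crude OR, {name}: {crude_or(CHOLERA, pos):.2f}")
    for group in ((0, 1, 1), (0, 0, 1), (0, 1, 0), (1, 1, 1)):
        print(group, crg_or(CHOLERA, group))
```

The crude margins are sums of the CRG cells, which makes explicit that each crude reference group contains people who used the other sources.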

Discussion

We have shown that, when multiple exposures are evaluated in an epidemiologic study, using CRG analysis leads to more accurate understanding of the relationship between the exposures and outcome. A major utility of the CRG approach is that all individual and joint effect measures are explicitly presented [10, 11, 13, 14]. They are all untangled from each other and mutually comparable. In the data examples provided above, one can observe the dramatic increases in the effect measures for competing exposures when using the CRG analysis. We have shown the theoretical argument that in regression modeling the denominators for individual exposures differ and thus, the magnitudes of their individual ORs are not comparable to each other.

The CRG analysis can be conducted in three ways that yield identical effect estimates. The first two are based on recoding the combinations of exposure levels into a 2×N table, followed by either the 2×2 table analyses in which each combination of the exposure levels is compared with the CRG, or regression modeling of the indicator variable representing combinations of the exposure levels with the CRG serving as the reference group.

The third is to use a saturated regression model, with all possible interaction terms being forced into the model regardless of their statistical significance. However, saturated regression models are rarely used in practice for several reasons. The complexity of the saturated model increases beyond two-way interactions, requiring tedious additional effort to calculate the individual and joint effects. Increasing number of interaction terms can lead to sparsely populated cells, causing the regression model to fail due to matrix tolerance being exceeded; yet even when the interaction terms are not statistically significant or otherwise inconsequential, they need to be kept in the regression model for correct calculation of the individual and joint effects. Finally, the resultant effect measures from the saturated models are limited to assessment of multiplicative effects, yet most literature or textbooks we have surveyed that cover this topic propose the CRG (or analogous terminology) for assessing additive interaction without stressing the importance of accurately estimating the individual effects [6–14]. In contrast to the saturated regression model, the recoding provides measures of effect and impact that are intuitively interpretable as individual and joint effects. For these reasons, it is more straightforward and convenient to present and interpret the results using the 2×N layout.

Using the CRG simulates an experiment with different intervention arms compared to a common control arm. The CRG thus provides direct estimates of the effects of individual exposures (e.g., vaccination or using personal protective equipment alone) and the effectiveness of combined measures (i.e., vaccination and using personal protective equipment).

The CRG analysis controls for confounding among the exposures of interest by restricting the analysis to one exposure at a time; thus, the effects of different exposures are untangled. If one wants to consider other confounding variables, adjustment can be made by stratifying the CRG analysis by those confounders. A Mantel-Haenszel test may be applied to each comparison from the 2×N table [2]. To adjust for several confounders, the recoded single exposure variable may be used as an indicator variable in a multivariate logistic regression model and the adjustments made by including the confounders in the model.
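As a companion to the stratified analysis, the Mantel-Haenszel summary odds ratio for one exposure group versus the CRG can be sketched as below. The strata and all counts are hypothetical (for example, two age groups), and the function name is illustrative; the sketch gives only the pooled point estimate, not the test statistic or its variance.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary OR across strata of a confounder.

    Each stratum is (a, b, c, d):
      a = cases in the exposure group, b = controls in the exposure group,
      c = cases in the CRG,            d = controls in the CRG.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts for one exposure group vs. the CRG in two strata.
strata = [
    (10, 5, 4, 20),  # stratum 1: stratum-specific OR = 10
    (6, 8, 3, 16),   # stratum 2: stratum-specific OR = 4
]

if __name__ == "__main__":
    print(round(mantel_haenszel_or(strata), 2))
```

The same function would be applied once per exposure group, each time against the CRG within every stratum.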

The choice of the CRG might require additional consideration in certain situations. In the cholera outbreak example, the CRG could not be those who drank no water since everyone had to drink water. As such, not getting drinking water from any of the three main sources at the camp was chosen as the CRG [15]. Another common situation arises in food-borne disease outbreak investigations when the causative agent is distributed among several foods. This could happen from a common ingredient, cross-contamination during preparation, an infected food handler, or an infected customer at a self-serve buffet. The outbreak may be associated with several contaminated foods that may be apparently unrelated. Each will have unimpressive effect measures, some of which may not be statistically significant. The investigator could erroneously dismiss these food items as unimportant, hence missing important insights about the outbreak. In such situations, a reasonable approach might involve selecting several foods with the highest risk ratios and using the rest of the foods together as the CRG.

Recoding several exposures into a single variable can lead to sparsely populated cells. Zero disease in the CRG will lead to uninformative effect estimates (e.g., infinite risk ratios) for all individual and joint exposures. However, the disease prevalence alone for each individual and joint exposure will then equal the risk difference, a valid estimation of risk. With cross-sectional and retrospective cohort studies, the risk difference and attributable fraction will be available directly from the 2×N table. Case-control or case-base studies can also yield valid risk differences if the controls have a known sampling fraction from the population. Sparse, non-zero cells in the CRG can be handled by substituting a well populated group and inverting the resultant ORs accordingly. Sparse cells representing individual exposures are more problematic since individual effects are usually more important to estimate. The workaround for sparse cells for a joint effect could be to ignore these groups altogether as was done in the cholera investigation [15] or to use statistical techniques dealing with sparse cells, such as Firth’s logistic regression [22]. The sparse cell issue is not specific to the CRG approach. In anticipation of analyzing data using the CRG, epidemiologists may consider increasing the sample size and oversampling the CRG or other exposure groups of special interest based on the descriptive epidemiologic analysis.
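One possible reading of the reference-substitution idea is sketched below with four cells from Table 3, where the CRG holds only 1 case and 16 controls. Choosing the tank-only group as the temporary reference is an assumption made for illustration, as are the labels and the function name.

```python
# Selected cells from the CRG panel of Table 3: (cases, controls).
CELLS = {
    "tank and stream": (39, 36),
    "stream only": (8, 9),
    "tank only": (24, 33),
    "none of the three (CRG)": (1, 16),
}


def or_vs(counts, group, reference):
    """Odds ratio of one group relative to a chosen reference group."""
    a, b = counts[group]
    c, d = counts[reference]
    return (a * d) / (b * c)


if __name__ == "__main__":
    temp_ref = "tank only"  # well populated group used as temporary reference
    crg = "none of the three (CRG)"

    # OR of the sparse CRG relative to the temporary reference.
    crg_vs_temp = or_vs(CELLS, crg, temp_ref)
    # Inverting it recovers the tank-only OR relative to the CRG.
    print(round(1 / crg_vs_temp, 1))

    # Any other group can be re-expressed relative to the CRG by division.
    stream_vs_temp = or_vs(CELLS, "stream only", temp_ref)
    print(round(stream_vs_temp / crg_vs_temp, 1))
```

The point estimates are unchanged by the choice of reference; what changes is which contrasts are estimated directly from well populated cells.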

In conclusion, a CRG approach provides effect measures that are easily interpreted. The CRG corrects the shifting reference group problem, thus making effect estimates mutually comparable. Individual effects are also freed of confounding from other exposures of interest. The 2×N table display also provides intuitive interpretations of the effects of individual and joint exposures. The CRG approach may be combined with multivariate modelling to adjust for confounding by extraneous variables. We, therefore, advocate the use of the CRG in retrospective epidemiologic studies whenever multiple exposures are known or suspected to be involved.

Supporting information

S1 File. Data sets on Ebola outbreak in Zaire (first tab) and cholera outbreak in Uganda (second tab).

(XLSX)

Acknowledgments

We thank Dr. Fred Monje and other colleagues at the Uganda Public Health Fellowship Program for letting us use the cholera outbreak investigation as an illustrative example and for agreeing to publish the associated data set.

Disclaimer: The findings and conclusions in this manuscript are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

Data Availability

All relevant data are within the paper and its Supporting Information files.

Funding Statement

The author(s) received no specific funding for this work.

References

  • 1. Rothman K. Types of epidemiologic study. In: Rothman K, ed. Modern Epidemiology. Boston, Toronto: Little Brown and Company; 1986. p. 59.
  • 2. Schlesselman JJ. Basic methods of analysis. In: Schlesselman JJ, ed. Case-Control Studies: Design, Conduct, Analysis. New York: Oxford University Press; 1982. p. 196–7.
  • 3. Dicker RC. Analyzing and Interpreting Data. In: Rasmussen S, Goodman RA, eds. The CDC Field Epidemiology Manual. New York: Oxford University Press; 2019. p. 173–4.
  • 4. Clayton D, Hills M. Many Levels of Exposure. In: Clayton D, Hills M, eds. Statistical Models in Epidemiology. Oxford: Oxford University Press; 1993. p. 158–60.
  • 5. Weiss NS, Koepsell TD. Case-Control Studies. In: Weiss NS, Koepsell TD, eds. Epidemiologic Methods: Studying the Occurrence of Illness. 2nd ed. New York: Oxford University Press; 2014. p. 351.
  • 6. Greenland S, Lash TL, Rothman K. Concepts of Interaction. In: Rothman K, Greenland S, Lash TL, eds. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2008. p. 71–83.
  • 7. Gordis L. More on Causal Inferences: Bias, Confounding, and Interaction. In: Gordis L, ed. Epidemiology. 5th ed. Philadelphia: Elsevier Saunders; 2014. p. 270–6.
  • 8. Keyes K, Galea S. When Do Causes Work Together? In: Keyes K, Galea S, eds. Epidemiology Matters: A New Introduction to Methodological Foundations. New York: Oxford University Press; 2014.
  • 9. De Mutsert R, Jager KJ, Zoccali C, Dekker FW. The effect of joint exposures: examining the presence of interaction. Kidney Int. 2009; 75(7):677–81. Epub 20090204. doi: 10.1038/ki.2008.645.
  • 10. Knol MJ, VanderWeele TJ. Recommendations for presenting analyses of effect modification and interaction. Int J Epidemiol. 2012; 41(2):514–20. Epub 20120109. doi: 10.1093/ije/dyr218; PubMed Central PMCID: PMC3324457.
  • 11. Vandenbroucke JP, von Elm E, Altman DG, Gotzsche PC, Mulrow CD, Pocock SJ, et al. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): explanation and elaboration. Epidemiology. 2007; 18(6):805–35. doi: 10.1097/EDE.0b013e3181577511.
  • 12. VanderWeele TJ, Knol MJ. A Tutorial on Interaction. Epidemiol Methods. 2014; 3(1):39.
  • 13. Knol MJ, Egger M, Scott P, Geerlings MI, Vandenbroucke JP. When one depends on the other: reporting of interaction in case-control and cohort studies. Epidemiology. 2009; 20(2):161–6. doi: 10.1097/EDE.0b013e31818f6651.
  • 14. Botto LD, Khoury MJ. Commentary: facing the challenge of gene-environment interaction: the two-by-four table and beyond. Am J Epidemiol. 2001; 153(10):1016–20. doi: 10.1093/aje/153.10.1016.
  • 15. Monje F, Ario AR, Musewa A, Bainomugisha K, Mirembe BB, Aliddeki DM, et al. A prolonged cholera outbreak caused by drinking contaminated stream water, Kyangwali refugee settlement, Hoima District, Western Uganda: 2018. Infect Dis Poverty. 2020; 9(1):1–10.
  • 16. Vighio A, Syed MA, Hussain I, Zia SM, Fatima M, Masood N, et al. Risk Factors of Extensively Drug Resistant Typhoid Fever Among Children in Karachi: Case-Control Study. JMIR Public Health and Surveillance. 2021; 7(5):e27276.
  • 17. Liu M, Ou J, Zhang L, Shen X, Hong R, Ma H, et al. Protective effect of hand-washing and good hygienic habits against seasonal influenza: a case-control study. Medicine. 2016; 95(11):e3046. doi: 10.1097/MD.0000000000003046.
  • 18. Ruan F, Yang T, Ma H, Jin Y, Song S, Fontaine RE, et al. Risk factors for hand, foot, and mouth disease and herpangina and the preventive effect of hand-washing. Pediatrics. 2011; 127(4):898–904. doi: 10.1542/peds.2010-1497.
  • 19. Agresti A, ed. Categorical data analysis. 2nd ed. Hoboken, NJ: John Wiley & Sons; 2002.
  • 20. World Health Organization. Ebola haemorrhagic fever in Zaire, 1976. Report of an international commission. Bull World Health Organ. 1978; 56(2):271–93.
  • 21. Heyman D, Brown W. An outbreak of hemorrhagic fever in Africa ("Ebola"), 812-N10. [case study]. 2010. Available from: https://www.tephinet.org/tephinet-learning-center/tephinet-library/an-outbreak-of-hemorrhagic-fever-in-africa-ebola.
  • 22. Puhr R, Heinze G, Nold M, Lusa L, Geroldinger A. Firth’s logistic regression with rare events: accurate effect estimates and predictions? Stat Med. 2017; 36(14):2302–17. doi: 10.1002/sim.7273.

Decision Letter 0

Maureen Musimbi Akolo

29 Aug 2023

PONE-D-23-13611

Untangling the Effects of Multiple Exposures with a Common Reference Group in an Epidemiologic Study: A Practical Revisit

PLOS ONE

Dear Dr. Zhu,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please work on your introduction, purpose of study, and background by adding more precise literature. If possible, use an editor to edit your work before resubmitting to avoid typographical errors. Also, see the comments from a reviewer below and address them.

Please submit your revised manuscript by Oct 13 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Maureen Musimbi Akolo, Ph.D

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

3. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

********** 

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

********** 

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

********** 

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

********** 

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors describe a method of analyzing epidemiological data using a common reference group, which avoids the shifting reference group problem and the misinterpretation of data that follows from it. The article is well written and easy to follow, and the language is clear enough for even an unseasoned researcher to understand. However, I felt the introduction lacked details about the purpose and background of the review. I recommend that the authors add more information and literature review to explain how the review would improve the use of the common reference group in the analysis of epidemiologic studies.

Overall, the paper is sound and would make an interesting addition to the literature to augment epidemiological research.

I do not have the expertise to comment on the statistical approaches in this paper.

********** 

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2023 Dec 15;18(12):e0295915. doi: 10.1371/journal.pone.0295915.r002

Author response to Decision Letter 0


14 Sep 2023

Dear Dr. Akolo,

We thank you and the reviewer for your kind reviews of our manuscript, which have helped improve the clarity of our manuscript.

We have addressed all comments you and the reviewer have made, which are detailed in our Response to the Editor and the Reviewer.

I look forward to hearing from you at your earliest convenience.

Respectfully,

Bao-Ping Zhu, MD, PhD, MS

Director, Office of Science Quality and Library Services

Office of Science

United States Centers for Disease Control and Prevention

Attachment

Submitted filename: Response to reviewers.docx

Decision Letter 1

Awatif Abid Al-Judaibi

4 Dec 2023

Untangling the Effects of Multiple Exposures with a Common Reference Group in an Epidemiologic Study: A Practical Revisit

PONE-D-23-13611R1

Dear Dr. Bao-Ping Zhu,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Awatif Abid Al-Judaibi, PhD

Academic Editor

PLOS ONE

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: The authors have integrated the reviewers' feedback well. The introduction is clear and provides arguments as to why the CRG approach might be appropriate.

The examples used are understandable and the data are provided in the supplementary material.

The authors also discussed possible disadvantages of the CRG approach, which is very important for studies with a small sample size.

There is one typo in the Discussion section, line 236: "requring" should be changed to "requiring".

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

Acceptance letter

Awatif Abid Al-Judaibi

8 Dec 2023

PONE-D-23-13611R1

Untangling the Effects of Multiple Exposures with a Common Reference Group in an Epidemiologic Study: A Practical Revisit

Dear Dr. Zhu:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Professor Awatif Abid Al-Judaibi

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 File. Data sets on Ebola outbreak in Zaire (first tab) and cholera outbreak in Uganda (second tab).

    (XLSX)

    Data Availability Statement

    All relevant data are within the paper and its Supporting Information files.


    Articles from PLOS ONE are provided here courtesy of PLOS