letter
BMC Med. 2026 Apr 9;24:218. doi: 10.1186/s12916-026-04820-0

Response to: Matters Arising in relation to “Evaluating agreement between individual nutrition randomised controlled trials and cohort studies – a meta-epidemiological study”

Julia Stadelmaier 1, Gina Bantle 1, Lea Gorenflo 1, Eva Kiesswetter 1, Adriani Nikolakopoulou 2,3, Lukas Schwingshackl 1
PMCID: PMC13063864  PMID: 41957603

Abstract

We respond to the Matters Arising article by Calkins et al. commenting on our meta-epidemiological study “Evaluating agreement between individual nutrition randomised controlled trials and cohort studies”. We appreciate the opportunity to respond to the points raised and to clarify the methods and interpretation of our work.

Keywords: Meta-epidemiological study, Nutrition, Cohort studies, Randomised controlled trials


We thank Calkins and colleagues for their thoughtful comments [1] on our meta-epidemiological study [2] and for their interest in re-analysing our data. We appreciate the opportunity to respond to the points raised and to clarify the methods and interpretation of our work.

Interpretation of the ratio of risk ratios

We agree that our findings on the agreement between randomised controlled trials (RCTs) and cohort studies should not be interpreted causally. Accordingly, we intentionally avoided causal language throughout the manuscript [2]. Our meta-research study is observational in nature and aims to describe and quantify associations across matched RCT-cohort pairs. By design, it does not permit causal inference [3, 4], and in particular, no biological confirmation can be inferred from the ratio of risk ratios (RRR). The pooled RRR was therefore used as a descriptive metric to assess whether effect estimates systematically deviated from 1.00 across study pairs, and whether such deviations were associated with PI/ECO (population, intervention/exposure, comparison, outcome) similarity or risk-of-bias (RoB) judgements.
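As a purely illustrative sketch of how an RRR for a single study pair can be computed, the snippet below uses the standard log-scale approach: the log RRR is the difference of the two log risk ratios, and its standard error combines the two standard errors recovered from the 95% confidence intervals. The function names, the input values, and the choice to place the RCT estimate in the numerator are assumptions made for this example; they are not taken from the original analysis.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z_95):
    """Recover the standard error of log(RR) from a confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def ratio_of_risk_ratios(rr_rct, ci_rct, rr_cohort, ci_cohort, z=Z_95):
    """Return the RRR and its confidence interval for one study pair.

    Assumption of this sketch: RRR = RR_RCT / RR_cohort, and the two
    estimates are treated as independent.
    """
    log_rrr = math.log(rr_rct) - math.log(rr_cohort)
    se = math.sqrt(se_from_ci(*ci_rct, z) ** 2 + se_from_ci(*ci_cohort, z) ** 2)
    lower = math.exp(log_rrr - z * se)
    upper = math.exp(log_rrr + z * se)
    return math.exp(log_rrr), (lower, upper)


# Hypothetical pair: both designs report the same point estimate.
rrr, (lower, upper) = ratio_of_risk_ratios(
    rr_rct=0.80, ci_rct=(0.64, 1.00),
    rr_cohort=0.80, ci_cohort=(0.70, 0.92),
)
print(f"RRR = {rrr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An RRR of 1.00 in this sketch only says that the two point estimates coincide; the width of the interval still has to be read alongside it, which is why the pooled RRR is treated here as a descriptive metric rather than as confirmation.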

We contend that Calkins and colleagues place disproportionate emphasis on the pooled RRR and do not fully reflect the overall conclusions of our meta-epidemiological study. In particular, their statement that we “interpreted a pooled RRR of 1.00 as evidence that most findings from individual RCTs were confirmed by cohort studies” does not accurately reflect our interpretation. Our conclusion was based on the complete set of analyses conducted in the study, rather than on the pooled RRR alone. Specifically, we explicitly considered the uncertainty surrounding the point estimate, assessed multiple factors, including similarity levels of PI/ECO and RoB ratings, through subgroup analyses, sensitivity analyses, and meta-regression, and discussed in detail individual study pairs that appeared to agree or disagree. The pooled RRR constituted only one component of a more comprehensive empirical assessment.

Re-analysis of study results by Calkins and colleagues

Based on their re-analysis, Calkins and colleagues argue that an RRR of 1.00 represents a “statistical artefact” driven by high variance in the underlying estimates. Their attempt to support this claim by comparing mismatched study pairs and observing a similar pooled estimate does not, however, demonstrate that our findings are invalid. At most, it shows that averaging across heterogeneous comparisons can yield a null estimate, an observation that is neither surprising nor inconsistent with our own interpretation. Indeed, what they present in their Fig. 1 is conceptually similar to what we show in Fig. 2 of our study [2], but without accounting for uncertainty, despite uncertainty being central to their critique.

PI/ECO similarity and matching of studies

Calkins and colleagues raised concerns about the appropriateness of relying solely on PI/ECO criteria for matching RCTs and observational studies. We agree that PI/ECO similarity alone does not capture all determinants relevant for comparing effect estimates across study types. Nevertheless, PI/ECO characteristics remain key for determining whether studies, irrespective of study design, are “sufficiently similar” for inclusion in evidence synthesis and meta-analysis. Beyond PI/ECO, our pairing approach also considered sample size and follow-up duration, and applied additional matching criteria (geographical location, sex, and age) when more than one cohort was a suitable match for a given RCT. We also acknowledged the shortcomings of this approach in the limitations section, noting that alternative matching choices could have led to different pairings.

We would also like to clarify that we did not claim that 92% of pairs were highly similar. Using a three-level classification system consistent with earlier methodological work [5, 6], we found that 71.9% of pairs (n = 46) were “similar but not identical”, indicating shared key PI/ECO characteristics alongside at least one meaningful difference in a PI/ECO component (e.g. high-risk population in RCT vs. general healthy population in cohort study). Importantly, by conducting the analysis at the individual study level, we were also able to perform closer PI/ECO matching and harmonisation of dose-specific effect estimates, an approach that goes beyond previous studies [5–7] and strengthens our evaluation.

Agreement between RCT and cohort study evidence

We would like to clarify that we did not claim that all study pairs were in agreement. On the contrary, we explicitly captured variability by reporting heterogeneity estimates and prediction intervals, and by investigating factors associated with variation in RRRs. Throughout the manuscript, we consistently stated that we observed “on average” no disagreement across pairs; this phrasing appears five times in our study [2]. This qualifier is absent from the interpretation presented by Calkins and colleagues.

Importantly, agreement in our study was not defined solely by the pooled RRR near 1.00, but also by alignment at the individual study-pair level, as assessed through visual inspection of the direction of effect estimates and the corresponding 95% confidence intervals. Based on this assessment, we highlighted that most matched RCT-cohort pairs agreed (53/64 pairs), while others disagreed (11/64). Our aim was to shed light on potential determinants of agreement between RCTs and cohort studies by discussing differences in PI/ECO similarity, RoB assessments, sample size, and follow-up time. Consistent with this aim, we performed subgroup analyses by PI/ECO similarity and found that agreement was weaker and heterogeneity greater among pairs classified as “broadly similar” (Table 1), a finding that further supports our cautious interpretation. As acknowledged in our manuscript, additional determinants of agreement warrant further exploration.

To sum up, we consider that our meta-epidemiological study was conducted and interpreted with appropriate methodological caution. By systematically evaluating 64 closely PI/ECO-matched RCT-cohort pairs across diverse diet-disease associations, we showed that findings from RCTs and cohort studies are often in agreement. This conclusion is based on three observations: (1) the pooled RRR revealed no systematic deviation from 1.00 across study pairs; (2) the majority (53/64) of RCT-cohort pairs showed agreement in their results; and (3) disagreement appeared more likely in pairs with lower PI/ECO similarity, differing geographic settings and follow-up durations, and distinct RoB judgements. The public health relevance of these results lies in providing insights into the extent of agreement between findings from RCTs and observational studies, the contexts in which similar findings can be expected, and thereby informing the careful integration of observational studies in nutrition research.

Acknowledgements

Not applicable.

Abbreviations

PI/ECO: Population, intervention/exposure, comparison, outcome

RCT: Randomised controlled trial

RoB: Risk of bias

RRR: Ratio of risk ratios

Authors’ contributions

J.S. and A.N. wrote the main manuscript text. All authors reviewed the manuscript.

Funding

None.

Data availability

No datasets were generated or analysed during the current study.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Calkins M, Nunez I, Soto-Mota A. Concordance between nutritional cohorts and randomised trials: biological confirmation or statistical artefact? A re-analysis of Stadelmaier et al. BMC Med. 2026.
2. Stadelmaier J, Bantle G, Gorenflo L, Kiesswetter E, Nikolakopoulou A, Schwingshackl L. Evaluating agreement between individual nutrition randomised controlled trials and cohort studies - a meta-epidemiological study. BMC Med. 2025;23(1):36. doi:10.1186/s12916-025-03860-2.
3. Christensen R, Berthelsen DB. Controversy and debate on meta-epidemiology. Paper 3: causal inference from meta-epidemiology: a reasonable goal, or wishful thinking? J Clin Epidemiol. 2020;123:131–2. doi:10.1016/j.jclinepi.2020.03.023.
4. Herbert RD. Controversy and debate on meta-epidemiology. Paper 2: meta-epidemiological studies of bias may themselves be biased. J Clin Epidemiol. 2020;123:127–30. doi:10.1016/j.jclinepi.2020.03.024.
5. Schwingshackl L, Balduzzi S, Beyerbach J, Brockelmann N, Werner SS, Zahringer J, et al. Evaluating agreement between bodies of evidence from randomised controlled trials and cohort studies in nutrition research: meta-epidemiological study. BMJ. 2021;374:n1864. doi:10.1136/bmj.n1864.
6. Stadelmaier J, Beyerbach J, Roux I, Harms L, Eble J, Nikolakopoulou A, et al. Evaluating agreement between evidence from randomised controlled trials and cohort studies in nutrition: a meta-research replication study. Eur J Epidemiol. 2024;39(4):363–78. doi:10.1007/s10654-023-01058-5.
7. Beyerbach J, Stadelmaier J, Hoffmann G, Balduzzi S, Bröckelmann N, Schwingshackl L. Evaluating concordance of bodies of evidence from randomized controlled trials, dietary intake, and biomarkers of intake in cohort studies: a meta-epidemiological study. Adv Nutr. 2022;13(1):48–65. doi:10.1093/advances/nmab095.



Articles from BMC Medicine are provided here courtesy of BMC
