Abstract
Background
Formulas for the extraction of continuous and binary effect sizes that are entered into a meta-analysis are readily available. Only some formulas for the extraction of count outcomes have been presented previously. The purpose of this methodological article is to present formulas for extracting effect sizes and their standard errors for studies of count outcomes with person-time denominators.
Methods
Formulas for the calculation of the number of events in a study and the corresponding person time in which these events occurred are presented. These formulas are then used to estimate the relevant effect sizes and standard errors of interest. These effect sizes are rates, rate ratios and rate differences for a two-group comparison and rate ratios and rate differences for a difference-in-difference design.
Results
Two studies from the field of suicide prevention are used to demonstrate the extraction of the information required to estimate effect sizes and standard errors. In the first example, the rate ratio for a two-group comparison was 0.957 (standard error of the log rate ratio, 0.035), and the rate difference was −0.56 per 100,000 person years (standard error 0.44). In the second example, the rate ratio for a difference-in-difference analysis was 0.975 (standard error of the log rate ratio 0.036) and the rate difference was −0.30 per 100,000 person years (standard error 0.42).
Conclusions
The application of these formulas enables the calculation of effect sizes that may not have been presented in the original study. This reduces the need to exclude otherwise eligible studies from a meta-analysis, potentially reducing one source of bias.
Keywords: Meta-analysis, Statistical Issues, Epidemiology
WHAT IS ALREADY KNOWN ON THIS TOPIC
Formulas are widely available to extract effect sizes for continuous and binary outcomes.
WHAT THIS STUDY ADDS
This study shows how effect sizes for count outcomes (rates, rate ratios and rate differences) and their standard errors can be calculated by estimating the number of events and the person time. Once this information is available for one or more groups, then different effect sizes can be calculated.
HOW MIGHT THIS STUDY AFFECT RESEARCH, POLICY OR PRACTICE
A potential source of bias with meta-analyses is the exclusion of eligible studies because the primary effect size of interest is not presented. The study shows how to extract information that enables the estimation of a variety of effect sizes, thus reducing a potential source of bias.
Introduction
Meta-analysis is commonly used to summarise the evidence on a topic. A key step in preparation for a meta-analysis is the extraction of effect sizes and their standard errors from eligible studies. These values are then entered into meta-analysis software either on their original scale or on a transformed scale to induce normality.1
Formulas for the extraction of effect sizes and standard errors are readily available for studies of continuous outcomes (eg, mean differences, standardised mean differences) and binary outcomes (eg, odds ratios, risk ratios, risk differences).2 3 However, information on how to extract rates from studies of the number of events with person time denominators is scattered. These designs are common in studies of injury prevention (eg, studies of suicides, transport crashes and falls) where effect sizes are often presented as rates (eg, rate ratios, rate differences).
The purpose of this methodological article is to present in one place the main formulas for extracting rates, rate ratios and rate differences with their standard errors for two-group studies (eg, exposed and unexposed individuals) and difference-in-difference studies (eg, intervention and control arms). The article draws on two studies from the suicide prevention literature to demonstrate the use of these formulas.
Methods
Rate for single group studies
For a study of a single group, let $d$ be the count of the events of interest and $t$ be the person time. The estimated rate, $\hat{r}$, is defined as
$$\hat{r} = \frac{d}{t} \tag{1}$$
The natural log of the estimated rate, $\ln(\hat{r})$, is assumed to be asymptotically normally distributed and is typically used for subsequent calculations, including meta-analysis. The log rate has a standard error that is defined as
$$SE(\ln \hat{r}) = \frac{1}{\sqrt{d}} \tag{2}$$
The 95% confidence interval of the rate is therefore
$$(LL, UL) = \exp\left(\ln(\hat{r}) \pm 1.96 \times SE(\ln \hat{r})\right) \tag{3}$$
where $LL$ and $UL$ are lower and upper limits. Note that the exponential transformation in equation 3 is used to return these limits to the rate scale. If a study presents the number of events and the rate, $d$ and $\hat{r}$, but not the person time, $t$, then the person time can be calculated using
$$t = \frac{d}{\hat{r}} \tag{4}$$
Similarly, if a study reports the 95% confidence interval for the rate but not its standard error, then $SE(\ln \hat{r})$ can be calculated approximately using the rate and the lower limit of the rate:
$$SE(\ln \hat{r}) \approx \frac{\ln(\hat{r}) - \ln(LL)}{1.96} \tag{5}$$
Finally, $SE(\ln \hat{r})$ can be used to approximate the number of events if $d$ is not reported as
$$d \approx \frac{1}{SE(\ln \hat{r})^{2}} \tag{6}$$
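The single-group formulas can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function and argument names are our own choices and do not come from any package.

```python
import math

Z_95 = 1.96  # standard normal quantile for a 95% confidence interval


def rate_with_ci(events: float, person_time: float) -> tuple[float, float, float]:
    """Equations 1 to 3: rate and its 95% confidence limits."""
    rate = events / person_time
    se_log_rate = 1.0 / math.sqrt(events)
    lower = math.exp(math.log(rate) - Z_95 * se_log_rate)
    upper = math.exp(math.log(rate) + Z_95 * se_log_rate)
    return rate, lower, upper


def person_time_from_rate(events: float, rate: float) -> float:
    """Equation 4: person time, in the units of the rate's denominator."""
    return events / rate


def se_log_rate_from_ci(rate: float, lower: float) -> float:
    """Equation 5: approximate SE of the log rate from the rate and lower limit."""
    return (math.log(rate) - math.log(lower)) / Z_95


def events_from_se(se_log_rate: float) -> float:
    """Equation 6: approximate number of events from the SE of the log rate."""
    return 1.0 / se_log_rate ** 2
```

Because equation 6 squares the reciprocal of a small number, the back-calculated count is sensitive to how the standard error is rounded, so it is worth recording the precision used at each step.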
Rate ratios for two-group comparison studies
Rate ratios are commonly reported and show the relative difference between the exposed and unexposed groups. The estimated rate ratio, $\widehat{RR}$, is defined as
$$\widehat{RR} = \frac{\hat{r}_1}{\hat{r}_0} = \frac{d_1 / t_1}{d_0 / t_0} \tag{7}$$
where the subscripts refer to the exposed and unexposed groups (with 1 denoting the exposed group, 0 the unexposed). Like the rate, rate ratios and other ratio measures such as relative risks and odds ratios are typically analysed on the log scale as they are assumed to be asymptotically normally distributed. The log rate ratio, $\ln(\widehat{RR})$, has standard error
$$SE(\ln \widehat{RR}) = \sqrt{\frac{1}{d_1} + \frac{1}{d_0}} \tag{8}$$
The 95% confidence interval is then
$$(LL, UL) = \exp\left(\ln(\widehat{RR}) \pm 1.96 \times SE(\ln \widehat{RR})\right) \tag{9}$$
The exponential transformation in equation 9 returns the limits to the rate ratio scale. It is common for studies to present the rate ratio and its 95% confidence interval but not the standard error of the log rate ratio. This standard error can be calculated approximately from the rate ratio and its lower limit, $LL$, by
$$SE(\ln \widehat{RR}) \approx \frac{\ln(\widehat{RR}) - \ln(LL)}{1.96} \tag{10}$$
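As a rough sketch of equations 7 to 10, the two-group rate ratio and its standard error can be computed as follows. The function names and the example counts in the usage line are hypothetical and chosen only for illustration.

```python
import math

Z_95 = 1.96  # standard normal quantile for a 95% confidence interval


def rate_ratio(events_exposed: float, time_exposed: float,
               events_unexposed: float, time_unexposed: float):
    """Equations 7 to 9: rate ratio, SE of its log, and 95% limits."""
    rr = (events_exposed / time_exposed) / (events_unexposed / time_unexposed)
    se_log_rr = math.sqrt(1.0 / events_exposed + 1.0 / events_unexposed)
    lower = math.exp(math.log(rr) - Z_95 * se_log_rr)
    upper = math.exp(math.log(rr) + Z_95 * se_log_rr)
    return rr, se_log_rr, lower, upper


def se_log_rr_from_ci(rr: float, lower: float) -> float:
    """Equation 10: approximate SE of the log rate ratio from a reported CI."""
    return (math.log(rr) - math.log(lower)) / Z_95


# Hypothetical counts and person time, for illustration only.
rr, se_log_rr, lower, upper = rate_ratio(120, 10.0, 150, 10.0)
```

The values passed to a meta-analysis would be `math.log(rr)` and `se_log_rr`, since both are on the log scale.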
Rate differences for two-group comparison studies
An alternative to the rate ratio is the rate difference.4 5 The rate difference, sometimes referred to as the risk difference, the excess risk or the attributable risk, is the difference in rates between two groups.4 This is estimated by
$$\widehat{RD} = \hat{r}_1 - \hat{r}_0 = \frac{d_1}{t_1} - \frac{d_0}{t_0} \tag{11}$$
The rate difference has standard error
$$SE(\widehat{RD}) = \sqrt{\frac{d_1}{t_1^{2}} + \frac{d_0}{t_0^{2}}} \tag{12}$$
Note that, unlike the standard errors of rates and rate ratios, the standard error of the rate difference is not on the log scale. This is because $\widehat{RD}$ is assumed to be asymptotically normally distributed; therefore, no transformation is required for the calculation of confidence intervals or other related statistics prior to meta-analysis. The 95% confidence interval of the rate difference is
$$(LL, UL) = \widehat{RD} \pm 1.96 \times SE(\widehat{RD}) \tag{13}$$
If the standard error is not presented, but the rate difference and its confidence interval are, then the standard error can be approximated from the rate difference and its lower limit, $LL$, by
$$SE(\widehat{RD}) \approx \frac{\widehat{RD} - LL}{1.96} \tag{14}$$
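Equations 11 to 14 translate directly into code. The sketch below uses our own function names; the result is in the units of the rates (for example, per 100,000 person-years if person time is supplied in units of 100,000).

```python
import math

Z_95 = 1.96  # standard normal quantile for a 95% confidence interval


def rate_difference(events_exposed: float, time_exposed: float,
                    events_unexposed: float, time_unexposed: float):
    """Equations 11 to 13: rate difference, its SE, and 95% limits."""
    rd = events_exposed / time_exposed - events_unexposed / time_unexposed
    se_rd = math.sqrt(events_exposed / time_exposed ** 2
                      + events_unexposed / time_unexposed ** 2)
    return rd, se_rd, rd - Z_95 * se_rd, rd + Z_95 * se_rd


def se_rd_from_ci(rd: float, lower: float) -> float:
    """Equation 14: approximate SE of the rate difference from a reported CI."""
    return (rd - lower) / Z_95
```

Unlike the rate ratio, no log transformation is applied before or after these calculations.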
Rate ratios for difference-in-difference studies
The two-group comparison described above considers the difference in rates between an exposed and unexposed group. In injury prevention, an example of such a study is a comparison of the number of suicides before and after the installation of a safety barrier on a bridge. In the context of the difference-in-difference design, we refer to this comparison as the intervention arm. The difference-in-difference design extends this design by including an additional arm that acts as a control group, for example, the rates during the same before and after periods at a nearby bridge.6
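The rate ratio for a difference-in-difference design, $\widehat{RR}_{DiD}$, is defined as
$$\widehat{RR}_{DiD} = \frac{\widehat{RR}_1}{\widehat{RR}_0} = \frac{\hat{r}_{11} / \hat{r}_{10}}{\hat{r}_{01} / \hat{r}_{00}} \tag{15}$$
where $\widehat{RR}_1$ is the estimated rate ratio in the intervention arm and $\widehat{RR}_0$ is the estimated rate ratio in the control arm. The subscript $i$ refers to the arm (1=intervention, 0=control) and, where the subscript $j$ is used, it refers to the level of exposure (1=exposed, 0=unexposed), so that $\hat{r}_{ij} = d_{ij} / t_{ij}$. Again, calculations are typically performed on the log scale. The log rate ratio, $\ln(\widehat{RR}_{DiD})$, has standard error
$$SE(\ln \widehat{RR}_{DiD}) = \sqrt{\frac{1}{d_{11}} + \frac{1}{d_{10}} + \frac{1}{d_{01}} + \frac{1}{d_{00}}} \tag{16}$$
which leads to the following formula for the 95% confidence interval
$$(LL, UL) = \exp\left(\ln(\widehat{RR}_{DiD}) \pm 1.96 \times SE(\ln \widehat{RR}_{DiD})\right) \tag{17}$$
If only $\widehat{RR}_{DiD}$ and its 95% confidence interval are presented, then $SE(\ln \widehat{RR}_{DiD})$ can be approximated using the formula
$$SE(\ln \widehat{RR}_{DiD}) \approx \frac{\ln(\widehat{RR}_{DiD}) - \ln(LL)}{1.96} \tag{18}$$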
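A minimal sketch of the difference-in-difference rate ratio (equations 15 to 17) is given below. The argument naming is an assumption made for this example: the first index is the arm (1 = intervention, 0 = control) and the second is the exposure level (1 = exposed, 0 = unexposed).

```python
import math

Z_95 = 1.96  # standard normal quantile for a 95% confidence interval


def did_rate_ratio(d11, t11, d10, t10, d01, t01, d00, t00):
    """Equations 15 to 17: difference-in-difference rate ratio.

    d = events, t = person time; first index = arm, second = exposure.
    """
    rr_intervention = (d11 / t11) / (d10 / t10)
    rr_control = (d01 / t01) / (d00 / t00)
    rr_did = rr_intervention / rr_control
    # Each of the four groups contributes 1/events to the variance.
    se_log_rr_did = math.sqrt(1 / d11 + 1 / d10 + 1 / d01 + 1 / d00)
    lower = math.exp(math.log(rr_did) - Z_95 * se_log_rr_did)
    upper = math.exp(math.log(rr_did) + Z_95 * se_log_rr_did)
    return rr_did, se_log_rr_did, lower, upper
```

Because four counts enter the standard error, a difference-in-difference estimate is always less precise than the two-group rate ratio from the intervention arm alone.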
Rate differences for difference-in-difference studies
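Rate differences can also be estimated for a difference-in-difference design. Let $\widehat{RD}_{DiD}$ be defined as
$$\widehat{RD}_{DiD} = \widehat{RD}_1 - \widehat{RD}_0 = (\hat{r}_{11} - \hat{r}_{10}) - (\hat{r}_{01} - \hat{r}_{00}) \tag{19}$$
where the subscripts are defined as before. That is, the subscript $i$ refers to the intervention and control groups (coded 1 and 0, respectively), and when the subscript $j$ is used, this refers to the exposure level (exposed and unexposed, coded 1 and 0). $\widehat{RD}_{DiD}$ has standard error
$$SE(\widehat{RD}_{DiD}) = \sqrt{\frac{d_{11}}{t_{11}^{2}} + \frac{d_{10}}{t_{10}^{2}} + \frac{d_{01}}{t_{01}^{2}} + \frac{d_{00}}{t_{00}^{2}}} \tag{20}$$
which leads to the lower and upper limits of the 95% confidence interval
$$(LL, UL) = \widehat{RD}_{DiD} \pm 1.96 \times SE(\widehat{RD}_{DiD}) \tag{21}$$
Finally, the standard error can be approximated if only $\widehat{RD}_{DiD}$ and its confidence interval are presented
$$SE(\widehat{RD}_{DiD}) \approx \frac{\widehat{RD}_{DiD} - LL}{1.96} \tag{22}$$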
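The difference-in-difference rate difference (equations 19 to 21) can be sketched in the same way. As before, the function name and argument order are our own convention: first index for the arm, second for the exposure level.

```python
import math

Z_95 = 1.96  # standard normal quantile for a 95% confidence interval


def did_rate_difference(d11, t11, d10, t10, d01, t01, d00, t00):
    """Equations 19 to 21: difference-in-difference rate difference.

    d = events, t = person time; first index = arm, second = exposure.
    """
    rd_intervention = d11 / t11 - d10 / t10
    rd_control = d01 / t01 - d00 / t00
    rd_did = rd_intervention - rd_control
    # Each group contributes events / person_time**2 to the variance.
    se_rd_did = math.sqrt(d11 / t11 ** 2 + d10 / t10 ** 2
                          + d01 / t01 ** 2 + d00 / t00 ** 2)
    return rd_did, se_rd_did, rd_did - Z_95 * se_rd_did, rd_did + Z_95 * se_rd_did
```

All four person-time values need to be expressed in the same units, otherwise the rates being subtracted are not comparable.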
Results
This section illustrates the use of these formulas to extract the required information for a meta-analysis using the results from two studies. The first study, by Baran and Kropiwnicki,7 examines suicide rates in Sweden before and after the implementation of a national suicide prevention strategy in 2008. It is an example of a two-group comparison study. In the 6 years before the introduction of the strategy, the rate, $\hat{r}_0$, was 12.90 per 100,000 (95% CI 12.39 to 13.41). In the 6 years after, the rate, $\hat{r}_1$, was 12.34 per 100,000 (95% CI 11.69 to 12.99). The goal is to calculate the rate ratio and the standard error of the log rate ratio, neither of which is presented in the original study. To do this, we require the numbers of events, $d_1$ and $d_0$, as well as the person time, $t_1$ and $t_0$. This is done by first estimating the standard error of the log rates. In the period after the strategy was introduced, the standard error of the log rate (from equation 5) is
$$SE(\ln \hat{r}_1) \approx \frac{\ln(12.34) - \ln(11.69)}{1.96} = 0.028$$
This standard error can then be used to estimate the number of suicides (equation 6).
$$d_1 \approx \frac{1}{0.028^{2}} = 1276$$
That is, there were approximately 1276 suicides in the period after the strategy was introduced. Now the person time can be estimated. From equation 4,
$$t_1 = \frac{1276}{12.34} = 103.4$$
$t_1$ is the person time divided by 100,000 because the rates are per 100,000 person-years. In its original metric, the data are 10,340,000 person-years. For the period before the strategy was introduced, the corresponding values are $SE(\ln \hat{r}_0) = 0.021$, $d_0 = 2268$ and $t_0 = 175.8$ (or 17,580,000 person-years). With these values calculated, it is now possible to estimate the rate ratio and the standard error of the log rate ratio using equations 7 and 8.
$$\widehat{RR} = \frac{12.34}{12.90} = 0.957, \qquad SE(\ln \widehat{RR}) = \sqrt{\frac{1}{1276} + \frac{1}{2268}} = 0.035$$
The final step is to then transform the rate ratio onto the log scale. This is because the commonly used meta-analysis methods assume the study-specific estimates are normally distributed. The rate ratio is not normally distributed, but its log is approximately normally distributed. Thus, $\ln(\widehat{RR}) = \ln(0.957) = -0.044$. The standard error is already on the log scale, so no further transformation is needed.
If the goal of the meta-analysis is instead to analyse rate differences, then these can easily be calculated from the derived values. Using equations 11 and 12,
$$\widehat{RD} = 12.34 - 12.90 = -0.56, \qquad SE(\widehat{RD}) = \sqrt{\frac{1276}{103.4^{2}} + \frac{2268}{175.8^{2}}} = 0.44$$
The rate difference is approximately normally distributed, and its standard error is on the same scale. Therefore, no further transformation of these values is needed to conduct a meta-analysis.
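The first example can be reproduced with a short script. This is a sketch under one stated assumption: the standard error of each log rate is rounded to three decimal places before equation 6 is applied, which is the precision that recovers the 1276 suicides and 10,340,000 person-years quoted above. Without that rounding, the back-calculated counts are somewhat larger. The helper name `back_calculate` is our own.

```python
import math

Z_95 = 1.96  # standard normal quantile for a 95% confidence interval


def back_calculate(rate: float, lower: float) -> tuple[float, int, float]:
    """Return (SE of log rate, events, person time) from a rate and its lower limit."""
    # Assumption: the SE is rounded to three decimals before equation 6.
    se_log_rate = round((math.log(rate) - math.log(lower)) / Z_95, 3)  # equation 5
    events = round(1.0 / se_log_rate ** 2)                             # equation 6
    person_time = events / rate                                        # equation 4
    return se_log_rate, events, person_time


# Rates per 100,000 person-years and their lower 95% limits, as quoted in the text.
se1, d1, t1 = back_calculate(rate=12.34, lower=11.69)  # after the strategy
se0, d0, t0 = back_calculate(rate=12.90, lower=12.39)  # before the strategy

rate_ratio = (d1 / t1) / (d0 / t0)              # equation 7
se_log_rr = math.sqrt(1 / d1 + 1 / d0)          # equation 8
rate_diff = d1 / t1 - d0 / t0                   # equation 11
se_rd = math.sqrt(d1 / t1 ** 2 + d0 / t0 ** 2)  # equation 12

print(d1, d0)                               # 1276 2268
print(f"{rate_ratio:.3f} {se_log_rr:.3f}")  # 0.957 0.035
print(f"{rate_diff:.2f} {se_rd:.2f}")       # -0.56 0.44
```

Keeping a script like this alongside the extraction spreadsheet makes the rounding decisions explicit and easy to audit.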
The second example uses data from a difference-in-difference study. Page and colleagues8 report on a national suicide prevention trial in Australia. The trial was implemented in 12 areas, and they compared population suicide rates before and after implementation. Comparable data were also available for control areas. The results of this study are summarised in table 1. Although the study already presents estimates of the rate ratio for a difference-in-difference analysis, the calculations are repeated here as a demonstration.
Table 1. Number of suicides and suicide rates from a difference-in-difference study by Page et al.8
| | Intervention, pre-implementation | Intervention, post-implementation | Control, pre-implementation | Control, post-implementation |
|---|---|---|---|---|
| Number | | | | |
| Rate | | | | |
The study reports the number of suicides in the four groups defined by the arm (intervention and control) and exposure period (pre-implementation and post-implementation). The study also reports the corresponding rates in these groups. The first step is to calculate the person time in each group, which is found using equation 4, as above. This gives the person time for each of the four groups, $t_{10}$, $t_{11}$, $t_{00}$ and $t_{01}$. It is now possible to calculate the rate ratio for the difference-in-difference, $\widehat{RR}_{DiD}$, and the standard error of the log rate ratio, $SE(\ln \widehat{RR}_{DiD})$, using equations 15 and 16. This gives a rate ratio of 0.975 with a standard error of the log rate ratio of 0.036.
$\widehat{RR}_{DiD}$ is approximately the same as the value reported in the original study. The standard error is similar to the one obtained from the reported confidence interval (from equation 18).
The difference between the two values is likely due to rounding error. As $\widehat{RR}_{DiD}$ is not normally distributed, a log transformation is required prior to meta-analysis (ie, $\ln(\widehat{RR}_{DiD}) = \ln(0.975) = -0.025$). No transformation is required for its standard error as it is already on the log scale. Finally, the rate difference for the difference-in-difference can also be calculated using equations 19 and 20. This gives a rate difference of −0.30 per 100,000 person years with a standard error of 0.42.
$\widehat{RD}_{DiD}$ is assumed to be approximately normally distributed, so no transformation is needed to undertake a meta-analysis of this effect size.
Discussion
One of the challenging parts of undertaking a meta-analysis is extracting consistent effect sizes and their standard errors from eligible studies. In a field such as injury prevention, a set of eligible studies might include a mixture of rate ratios, rate differences and difference-in-difference rate ratios and rate differences. It is essential to be able to convert these different effect sizes onto the same metric for meta-analysis. While the conversion to a single metric is usually straightforward, the calculation of the appropriate standard error can be challenging because these are often not published in the original studies. This is an important problem. When researchers are unable to extract the required information, a study is often excluded from the meta-analysis. Frequent exclusion of otherwise eligible studies from a meta-analysis creates a potential risk of bias because not all the evidence is considered.
In this methodological article, the formulas for extracting key values for estimating commonly used effect sizes are presented. The basis for estimating the different effect sizes discussed here is the calculation of a rate for a single group. Since rates are calculated from the number of events, $d$, and the person time, $t$, one way to solve the problem is to find these values directly in the study or find ways of calculating these values. Equations 4, 5 and 6 are key to calculating $d$ and $t$ if this information is not presented. Once the values of $d$ and $t$ are available for each group, the calculation of the appropriate effect size and its standard error easily follows. For rate ratios for a two-group comparison, the key formulas are equations 7 and 8; for rate differences for a two-group comparison, see equations 11 and 12; for rate ratios for a difference-in-difference study, see equations 15 and 16; and for rate differences for a difference-in-difference study, see equations 19 and 20. Importantly, one of the best ways of avoiding errors when calculating estimates is to be as transparent as possible. Documenting the calculations and presenting these in a supplementary appendix to the meta-analysis is an excellent way of picking up any errors and showing the reader precisely how the calculations have been done (along with any assumptions made).
All the formulas presented here for standard errors assume the effect size is normally distributed, either on the log scale (for rates and rate ratios) or on the original scale (for rate differences). This assumption should hold when there are sufficient observations in each group but may not for small samples of rare events. Most formulas presented here will not work when there are counts of zero in one or more study groups because the log of zero is undefined. The solution to this problem is to enter the counts of events and the person time directly into a meta-analysis, modelling the rate ratios using a random effects Poisson regression model or a binomial-normal model. These approaches are covered elsewhere.1 9 10 Finally, all formulas are for unadjusted effect sizes. Depending on the study question, if the adjusted effect size and its confidence interval are presented, then a standard error can be calculated using equations 5, 10, 18 and 22. However, this relies on the effect sizes having been adjusted for similar covariates. If that is not the case, it may result in an apples-with-oranges comparison in the meta-analysis. In that vein, it is not good practice to undertake a meta-analysis of different effect sizes, for example, mixing rate ratios with odds ratios. These effect sizes are not the same, and odds ratios only approximate rate ratios when the outcome is a rare event.11
In summary, the formulas reported here enable effect sizes for rates to be calculated when they may not have been included in the original manuscript. These formulas will be most useful when some eligible studies present information on one metric—for instance, rate ratios—and other studies on another—for instance, rate differences. Being able to convert them all to the same metric will increase the number of available studies in a meta-analysis and potentially reduce the risk of bias due to excluding otherwise eligible studies.
Footnotes
Funding: Matthew Spittal is supported by a National Health and Medical Research Council Investigator Grant (GNT2025205). The funder had no role in the study’s conceptualisation, design, interpretation of results and drafting of the manuscript or the decision to submit for publication.
Patient consent for publication: Not applicable.
Ethics approval: Not applicable.
Provenance and peer review: Not commissioned; externally peer reviewed.
Patient and public involvement: Patients and/or the public were not involved in the design, conduct, reporting or dissemination plans of this research.
Data availability statement
All data relevant to the study are included in the article or uploaded as supplementary information.
References
- 1. Viechtbauer W. Conducting meta-analyses in R with the metafor package. J Stat Softw. 2010;36:1–48. doi: 10.18637/jss.v036.i03.
- 2. Higgins JPT, Thomas J, Chandler J, et al. Cochrane handbook for systematic reviews of interventions. 2nd edn. Glasgow: Wiley Blackwell; 2019.
- 3. Lipsey MW, Wilson DB. Practical meta-analysis. Thousand Oaks, CA: Sage Publications; 2001.
- 4. Hennekens CH, Buring JE, Mayrent SL. Epidemiology in medicine. Little, Brown; 1987.
- 5. Lash TL, VanderWeele TJ, Rothman KJ, et al. Modern epidemiology. Mexico: Wolters Kluwer; 2021.
- 6. Spittal MJ, Gunnell D, Sinyor M, et al. Evaluating Population-Level Interventions and Exposures for Suicide Prevention. Crisis. 2025;46:50–5. doi: 10.1027/0227-5910/a000961.
- 7. Baran A, Kropiwnicki P. Advantages and pitfalls of the Swedish National Program for Suicide Prevention 2008. Psychiatria i Psychologia Kliniczna. 2015;15:175–81. doi: 10.15557/PiPK.2015.0026.
- 8. Page A, Pirkis J, Bandara P, et al. Early impacts of the ‘National Suicide Prevention Trial’ on trends in suicide and hospital admissions for self-harm in Australia. Aust N Z J Psychiatry. 2023;57:1384–93. doi: 10.1177/00048674231166330.
- 9. Spittal MJ, Pirkis J, Gurrin LC. Meta-analysis of incidence rate data in the presence of zero events. BMC Med Res Methodol. 2015;15:42. doi: 10.1186/s12874-015-0031-0.
- 10. Stijnen T, Hamza TH, Ozdemir P. Random effects meta-analysis of event outcome in the framework of the generalized linear mixed model with applications in sparse data. Stat Med. 2010;29:3046–67. doi: 10.1002/sim.4040.
- 11. Dettori JR, Norvell DC, Chapman JR. Risks, Rates and Odds: What’s the Difference and Why Does It Matter? Global Spine J. 2021;11:1156–8. doi: 10.1177/21925682211029640.
