Abstract
This cross-sectional study assesses the trends and characteristics of absolute measure reporting in highly cited medical journals from 2001 to 2019.
Controlled clinical trials, which guide the decisions made by patients, clinicians, and policy makers, often report only measures of relative effect.1,2 However, absolute measures can be easier to interpret, more clinically meaningful, and less likely to exaggerate differences when outcome risk is low.3 These measures include the absolute risk reduction (ARR), which is the difference in the observed risk of an event between 2 interventions, and the number needed to treat (NNT) and the number needed to harm (NNH), which are the numbers of patients who need to be treated to achieve 1 additional favorable or adverse outcome, respectively. In part because only 5% of trials published in highly cited journals before 1998 reported NNT and/or ARR,4 the Consolidated Standards of Reporting Trials (CONSORT) statement recommended that trials with binary outcomes report both relative and absolute measures.5 We assessed recent trends and characteristics of absolute measure reporting in highly cited medical journals to determine whether reporting has improved over time.
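To illustrate why relative measures can exaggerate differences when outcome risk is low, the following minimal Python sketch computes the ARR, the relative risk reduction (RRR), and the NNT from event counts. The function name and the 2 example trials are hypothetical and are not taken from the study; rounding the NNT up to the next whole patient is a common convention assumed here.

```python
from fractions import Fraction
from math import ceil


def absolute_and_relative_effects(control_events, control_n, treated_events, treated_n):
    """Return (ARR, RRR, NNT) from event counts in 2 trial arms."""
    control_risk = Fraction(control_events, control_n)
    treated_risk = Fraction(treated_events, treated_n)
    arr = control_risk - treated_risk  # absolute risk reduction
    if arr <= 0:
        raise ValueError("no risk reduction; an NNT is not defined here")
    rrr = arr / control_risk           # relative risk reduction
    nnt = ceil(1 / arr)                # rounded up by convention (assumption)
    return arr, rrr, nnt


# Hypothetical trials (not from the study): same relative effect, different baseline risk.
high_risk = absolute_and_relative_effects(200, 1000, 160, 1000)
low_risk = absolute_and_relative_effects(10, 1000, 8, 1000)

for label, (arr, rrr, nnt) in (("high baseline risk", high_risk),
                               ("low baseline risk", low_risk)):
    print(f"{label}: RRR = {float(rrr):.0%}, ARR = {float(arr):.1%}, NNT = {nnt}")
```

In both hypothetical trials the RRR is 20%, but the NNT is 25 when the control risk is 20% and 500 when the control risk is 1%, a difference that the relative measure alone does not show.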
Methods
We identified the 6 most-cited medical journals according to InCites Journal Citation Reports (Clarivate Analytics 2019) (Table 1). For each journal, we reviewed all issues published in 2001, 2007, 2013, and 2019 to identify all controlled clinical trials that reported analyses testing superiority of the intervention to control and abstract-level binary outcomes, including hazard ratios. For eligible trials, we identified key study characteristics and recorded whether at least 1 abstract-level positive (P < .05) binary efficacy and/or safety outcome was reported. Next, we determined whether any NNT, NNH, and/or ARR was reported in the abstract and/or full text. For each NNT/NNH, we recorded whether it was reported for a primary or secondary end point and whether 95% CIs, P values, and corresponding effect estimates were provided. Fisher exact and Mann-Whitney U tests were conducted in R, version 3.4.0 (R Foundation for Statistical Computing), with statistical significance defined as a 2-sided P < .05. Because publicly available data were used, this study did not require ethics approval or patient consent.
Table 1. Summary of NNT, NNH, and ARR Reporting in Controlled Clinical Trials From 6 Widely Cited General Medical Journals.
Journal by year | All trials (n = 875), total | All trials, ≥1 NNT, No. (%) | All trials, ≥1 NNH, No. (%) | All trials, ≥1 ARR, No. (%) | All trials, ≥1 NNT and/or NNH, No. (%) | All trials, ≥1 NNT, NNH, and/or ARR, No. (%) | Trials with a statistically significant result (n = 624)a, total | Significant result, ≥1 NNT and/or NNH, No. (%) | Significant result, ≥1 NNT, NNH, and/or ARR, No. (%) | Trials without a statistically significant result (n = 251)a, total | No significant result, ≥1 NNT and/or NNH, No. (%) | No significant result, ≥1 NNT, NNH, and/or ARR, No. (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2001 | ||||||||||||
NEJM | 57 | 3 (5.3) | 0 | 7 (12.3) | 3 (5.3) | 9 (15.8) | 51 | 3 (5.9) | 9 (17.6) | 6 | 0 | 0 |
Lancet | 21 | 2 (9.5) | 0 | 5 (23.8) | 2 (9.5) | 6 (28.6) | 16 | 2 (12.5) | 4 (25.0) | 5 | 0 | 2 (40.0) |
JAMA | 27 | 2 (7.4) | 1 (3.7) | 2 (7.4) | 2 (7.4) | 3 (11.1) | 20 | 1 (5.0) | 2 (10.0) | 7 | 1 (14.3) | 1 (14.3) |
BMJ | 20 | 3 (15.0) | 0 | 9 (45.0) | 3 (15.0) | 10 (50.0) | 13 | 3 (23.1) | 9 (69.2) | 7 | 0 | 1 (14.3) |
JAMA IM | 10 | 2 (20.0) | 0 | 2 (20.0) | 2 (20.0) | 4 (40.0) | 7 | 2 (28.6) | 3 (42.9) | 3 | 0 | 1 (33.3) |
Annals IM | 5 | 0 | 0 | 1 (20.0) | 0 | 1 (20.0) | 5 | 0 | 1 (20.0) | 0 | 0 | 0 |
Total | 140 | 12 (8.6) | 1 (0.7) | 26 (18.6) | 12 (8.6) | 33 (23.6) | 112 | 11 (9.8) | 28 (25.0) | 28 | 1 (3.6) | 5 (17.9) |
2007 | ||||||||||||
NEJM | 72 | 5 (6.9) | 0 | 17 (23.6) | 5 (6.9) | 18 (25.0) | 61 | 5 (8.2) | 16 (26.2) | 11 | 0 | 2 (18.2) |
Lancet | 46 | 4 (8.7) | 1 (2.2) | 14 (30.4) | 4 (8.7) | 16 (34.8) | 35 | 4 (11.4) | 13 (37.1) | 11 | 0 | 3 (27.3) |
JAMA | 31 | 5 (16.1) | 0 | 5 (16.1) | 5 (16.1) | 7 (22.6) | 15 | 4 (26.7) | 5 (33.3) | 16 | 1 (6.3) | 2 (12.5) |
BMJ | 25 | 6 (24.0) | 0 | 6 (24.0) | 6 (24.0) | 8 (32.0) | 22 | 6 (27.3) | 7 (31.8) | 3 | 0 | 1 (33.3) |
JAMA IM | 20 | 0 | 0 | 4 (20.0) | 0 | 4 (20.0) | 19 | 0 | 3 (15.8) | 1 | 0 | 1 (100) |
Annals IM | 18 | 1 (5.6) | 0 | 12 (66.7) | 1 (5.6) | 12 (66.7) | 13 | 1 (7.7) | 8 (61.5) | 5 | 0 | 4 (80.0) |
Total | 212 | 21 (9.9) | 1 (0.5) | 58 (27.4) | 21 (9.9) | 65 (30.7) | 165 | 20 (12.1) | 52 (31.5) | 47 | 1 (2.1) | 13 (27.7) |
2013 | ||||||||||||
NEJM | 85 | 6 (7.1) | 0 | 15 (17.6) | 6 (7.1) | 18 (21.2) | 62 | 6 (9.7) | 15 (24.2) | 23 | 0 | 3 (20.0) |
Lancet | 64 | 4 (6.3) | 1 (1.6) | 17 (26.6) | 5 (7.8) | 22 (34.4) | 42 | 5 (11.9) | 16 (38.1) | 22 | 0 | 6 (37.5) |
JAMA | 41 | 5 (12.2) | 0 | 11 (26.8) | 5 (12.2) | 14 (34.2) | 25 | 5 (20.0) | 9 (36.0) | 16 | 0 | 5 (31.3) |
BMJ | 20 | 1 (5.0) | 0 | 4 (20.0) | 1 (5.0) | 5 (25.0) | 11 | 0 | 2 (18.2) | 9 | 1 (11.1) | 3 (33.3) |
JAMA IM | 15 | 4 (26.7) | 0 | 3 (20.0) | 4 (26.7) | 5 (33.3) | 12 | 3 (25.0) | 4 (33.3) | 3 | 1 (33.3) | 1 (33.3) |
Annals IM | 16 | 0 | 0 | 10 (62.5) | 0 | 10 (62.5) | 11 | 0 | 5 (45.5) | 5 | 0 | 5 (100) |
Total | 241 | 20 (8.3) | 1 (0.4) | 60 (24.9) | 21 (8.7) | 74 (30.7) | 163 | 19 (11.7) | 51 (31.3) | 78 | 2 (2.6) | 23 (29.5) |
2019 | ||||||||||||
NEJM | 115 | 9 (7.8) | 3 (2.6) | 25 (21.7) | 10 (8.7) | 31 (27.0) | 79 | 10 (12.7) | 25 (31.6) | 36 | 0 | 6 (16.7) |
Lancet | 86 | 10 (11.6) | 2 (2.3) | 32 (37.2) | 12 (14.0) | 38 (44.2) | 60 | 11 (18.3) | 34 (56.7) | 26 | 1 (3.8) | 4 (15.4) |
JAMA | 61 | 0 | 0 | 42 (68.9) | 0 | 42 (68.9) | 32 | 0 | 23 (71.9) | 29 | 0 | 19 (65.6) |
BMJ | 9 | 2 (22.2) | 0 | 2 (22.2) | 2 (22.2) | 3 (33.3) | 6 | 2 (33.3) | 2 (33.3) | 3 | 0 | 1 (33.3) |
JAMA IM | 7 | 2 (28.6) | 0 | 0 | 2 (28.6) | 2 (28.6) | 6 | 2 (33.3) | 2 (33.3) | 1 | 0 | 0 |
Annals IM | 4 | 0 | 0 | 4 (100) | 0 | 4 (100) | 1 | 0 | 1 (100) | 3 | 0 | 3 (100) |
Total | 282 | 23 (8.2) | 5 (1.8) | 105 (37.2) | 26 (9.2) | 120 (42.6) | 184 | 25 (13.6) | 87 (47.4) | 98 | 1 (1.0) | 33 (33.7) |
All years | 875 | 76 (8.7) | 8 (0.9) | 249 (28.5) | 80 (9.1) | 292 (33.4) | 624 | 75 (12.0) | 218 (34.9) | 251 | 5 (2.0) | 74 (29.5) |
Abbreviations: Annals IM, Annals of Internal Medicine; ARR, absolute risk reduction; BMJ, British Medical Journal; JAMA, Journal of the American Medical Association; JAMA IM, JAMA Internal Medicine (formerly Archives of Internal Medicine); NEJM, New England Journal of Medicine; NNH, number needed to harm; NNT, number needed to treat.
a Defined as at least 1 statistically significant (P < .05) finding for a binary efficacy or safety outcome or a hazard ratio reported in the abstract of a trial.
Results
We identified 875 controlled trials meeting the eligibility criteria, of which 76 (8.7%) reported at least 1 NNT, 8 (0.9%) reported at least 1 NNH, and 249 (28.5%) reported at least 1 ARR (Table 1). In total, 292 trials (33.4%) reported at least 1 NNT, NNH, and/or ARR. Overall, 80 trials (9.1%) reported at least 1 NNT and/or NNH, a proportion that remained relatively constant between 2001 (12 of 140 [8.6%]) and 2019 (26 of 282 [9.2%]). In contrast, ARR reporting increased from 26 of 140 (18.6%) in 2001 to 105 of 282 (37.2%) in 2019 (P < .001).
Trials in the therapeutic area of oncology had the lowest rates of reporting NNT, NNH, and/or ARR, but there were no differences by intervention tested, patient follow-up, enrollment, or funding sources (Table 2). Trials with at least 1 statistically significant end point were more likely to report an NNT/NNH than those without (75 of 624 [12.0%] vs 5 of 251 [2.0%]; P < .001).
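The comparison of NNT/NNH reporting by statistical significance can be approximately reproduced from the counts above. The sketch below is a plain-Python, 2-sided Fisher exact test that sums the probabilities of all 2 x 2 tables no more likely than the observed one. It assumes that this P value came from the Fisher exact test named in the Methods, and it is not the authors' R code; the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """2-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is as likely as or less likely than the observed one
    # (the small relative tolerance guards against floating-point ties).
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-7))


# Counts from the text: 75 of 624 trials with a significant end point reported
# an NNT/NNH, compared with 5 of 251 trials without one.
p_value = fisher_exact_two_sided(75, 624 - 75, 5, 251 - 5)
print(f"2-sided Fisher exact P < .001: {p_value < 0.001}")
```

Exact integer binomial coefficients are used so that the large margins (875 trials) do not cause overflow before the final division.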
Table 2. Comparison of Trial Characteristics, by Reporting of NNT, NNH, and ARR.
Characteristic | Trials reporting at least 1 NNT, NNH, and/or ARR (n = 292), No. (%) | Trials not reporting any NNT, NNH, and/or ARR (n = 583), No. (%) | P value |
---|---|---|---|
Therapeutic area | | | <.001 |
Cardiology | 54 (18.5) | 104 (17.8) | |
Diabetes | 10 (3.4) | 26 (4.5) | |
Infectious disease | 43 (14.7) | 85 (14.6) | |
Mood and behavior disorders | 20 (6.9) | 29 (5.0) | |
Oncology | 21 (7.2) | 106 (18.2) | |
Other | 144 (49.3) | 233 (40.0) | |
Type of intervention | | | .44 |
Drugs, biologics, or vaccines | 155 (53.1) | 317 (54.4) | |
Surgeries and procedures | 42 (14.4) | 84 (14.4) | |
Supplements and nutritional interventions | 12 (4.1) | 37 (6.3) | |
Other | 83 (28.4) | 145 (24.9) | |
Patient follow-up, median (IQR), mo | 12 (3-21) | 12 (4-24) | .04 |
Enrollment, median (IQR) | 612 (252-1811) | 660 (278-2002) | .64 |
Funding | | | .65 |
Any industry | 141 (48.3) | 282 (48.4) | |
Government/nonprofit only | 147 (50.3) | 287 (49.2) | |
Disclosed no funding received or no funding statement | 4 (1.4) | 14 (2.4) | |
Abbreviations: ARR, absolute risk reduction; IQR, interquartile range; NNH, number needed to harm; NNT, number needed to treat. Trials reported were those with at least 1 statistically significant (P < .05) finding for a binary efficacy or safety outcome or hazard ratio reported in the abstract.
Among all 197 NNT/NNH reports, 95 (48.2%) were for primary end points and 76 (38.6%) had a 95% CI and/or P value. There were 114 NNT/NNH reports with a corresponding effect estimate reported anywhere in the text, of which 88 (77.2%) were statistically significant; 55 (48.2%) only had corresponding relative measures.
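For context on what reporting a 95% CI for an NNT involves, the sketch below inverts the limits of a Wald 95% CI for the ARR. This is one commonly described approach and is not necessarily the method used in any of the reviewed trials; the function name and the example counts are hypothetical.

```python
from math import sqrt


def nnt_with_wald_ci(control_events, control_n, treated_events, treated_n, z=1.96):
    """Return (NNT, lower, upper) by inverting a Wald CI for the ARR."""
    p_control = control_events / control_n
    p_treated = treated_events / treated_n
    arr = p_control - p_treated
    se = sqrt(p_control * (1 - p_control) / control_n
              + p_treated * (1 - p_treated) / treated_n)
    arr_low, arr_high = arr - z * se, arr + z * se
    if arr_low <= 0:
        # The ARR interval includes 0, so the NNT interval is not a single
        # finite range; only a point estimate (if any) is returned.
        return (1 / arr if arr > 0 else None), None, None
    # A larger ARR corresponds to a smaller NNT, so the limits swap.
    return 1 / arr, 1 / arr_high, 1 / arr_low


# Hypothetical trial (not from the study): 200/1000 control vs 150/1000 treated events.
nnt, lower, upper = nnt_with_wald_ci(200, 1000, 150, 1000)
print(f"NNT = {nnt:.0f} (95% CI, {lower:.1f}-{upper:.1f})")
```

When the CI for the ARR crosses 0, the sketch returns no limits, which reflects why an NNT is difficult to interpret for nonsignificant end points.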
Discussion
Among 875 controlled trials with binary outcomes and/or hazard ratios published in highly cited general medical journals, fewer than one-tenth reported at least 1 NNT or NNH, but more than one-quarter reported at least 1 ARR. The majority of NNT/NNH reports were presented for statistically significant end points but without 95% CIs or P values. These findings raise concerns about persistent incomplete and selective reporting of absolute measures in controlled trials published in widely cited general medical journals.
This study may not be generalizable to all journals and fields, and we did not review supplemental materials. Although we observed a 4-fold increase in reporting NNT and a 6-fold increase in reporting ARR, compared with a prior study examining trials through 1998,4 the present findings continue to support concerns about absolute measure reporting across journals and study designs.6 Despite the limitations of the ARR, which depends on the baseline risk of the population to be treated, and NNT/NNH, which may be misinterpreted, additional explicit statements from journals on absolute measure reporting requirements and greater oversight of the CONSORT checklist can help improve reporting practices and trial interpretability.
References
1. Heneghan C, Goldacre B, Mahtani KR. Why clinical trial outcomes fail to translate into benefits for patients. Trials. 2017;18(1):122. doi: 10.1186/s13063-017-1870-2
2. King NB, Harper S, Young ME. Use of relative and absolute effect measures in reporting health inequalities: structured review. BMJ. 2012;345:e5774. doi: 10.1136/bmj.e5774
3. Faraone SV. Interpreting estimates of treatment effects: implications for managed care. P T. 2008;33(12):700-711.
4. Nuovo J, Melnikow J, Chang D. Reporting number needed to treat and absolute risk reduction in randomized controlled trials. JAMA. 2002;287(21):2813-2814. doi: 10.1001/jama.287.21.2813
5. Schulz KF, Altman DG, Moher D; CONSORT Group. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMJ. 2010;340:c332. doi: 10.1136/bmj.c332
6. Mendes D, Alves C, Batel-Marques F. Number needed to treat (NNT) in clinical literature: an appraisal. BMC Med. 2017;15(1):112. doi: 10.1186/s12916-017-0875-8