2019 Sep 17;3(10):nzz104. doi: 10.1093/cdn/nzz104

TABLE 2.

Reporting of methods for selection of covariates¹

2007/2008 (n = 53) 2017/2018 (n = 97) All articles (n = 150)
Reported whether covariates were selected a priori?
 Some (but not all) covariates were selected a priori 1 (1.9) 1 (1.0) 2 (1.3)
 All covariates were selected a priori 0 (0) 7 (7.2) 7 (4.7)
 Not reported 52 (98.1) 89 (91.8) 141 (94.0)
Reported methods for selection of covariates for analysis?
 Reported criteria for selection of all covariates 9 (17.0) 21 (21.7) 30 (20.0)
 Reported criteria for selection of some covariates 10 (18.9) 15 (15.5) 25 (16.7)
 Not reported 34 (64.2) 61 (62.9) 95 (63.3)
Among studies that reported methods for selection of covariates, covariates were selected from:²
 Factors known or suspected to be associated with the exposure 2 (3.8) 3 (3.0) 5 (3.3)
 Known or established risk factors for the outcome 13 (24.5) 26 (26.8) 39 (26.0)
 Factors known or suspected to be associated with both the exposure and outcome 1 (1.9) 4 (4.1) 5 (3.3)
 Factors known or suspected to be associated with either the exposure or the outcome 2 (3.8) 0 (0) 2 (1.3)
 Confounders (factors associated with the exposure that also act on the outcome) as identified by directed acyclic graphs 0 (0) 4 (4.1) 4 (2.7)
 Other 4 (7.5) 2 (2.1) 6 (4.0)
 Not reported 34 (64.2) 61 (62.9) 95 (63.3)
Sources cited to support choice of covariates?²
 Systematic review 1 (1.9) 5 (5.2) 6 (4.0)
 Authoritative document (e.g., World Cancer Research Fund report) 0 (0) 4 (4.1) 4 (2.7)
 Narrative review 0 (0) 1 (1.0) 1 (0.7)
 Epidemiological study 9 (17.0) 11 (11.3) 20 (13.3)
 De novo literature search conducted by authors 1 (1.9) 9 (9.3) 10 (6.7)
 Methodology article 0 (0) 1 (1.0) 1 (0.7)
 No source cited 44 (83.0) 76 (78.4) 120 (80.0)
Reported use of data-driven methods for selection of covariates for inclusion in final analytic model?
 Reported use of data-driven methods for selection of all covariates for inclusion in final analytic model 6 (11.3) 8 (8.3) 14 (9.3)
 Reported use of a combination of data-driven and hypothesis-driven methods to select covariates for inclusion in final analytic model 11 (20.8) 15 (15.4) 26 (17.3)
 Did not report using any data-driven methods to select covariates 36 (67.9) 74 (76.3) 110 (73.3)
Among studies that reported use of data-driven methods for selection of covariates, covariates were selected based on:²
 Whether their inclusion appreciably changed the effect estimate of the primary exposure (change-in-estimate criterion) 11 (20.8) 11 (11.3) 22 (14.7)
 P value in the final analytic model 2 (3.8) 1 (1.0) 3 (2.0)
 P value in univariate model with the exposure as the dependent variable 1 (1.9) 3 (3.1) 4 (2.7)
 P value in univariate model with the outcome as the dependent variable 3 (5.7) 6 (6.1) 9 (6.0)
 Backward elimination 1 (1.9) 0 (0) 1 (0.7)
 Stepwise selection 0 (0) 2 (2.1) 2 (1.3)
 Magnitude of correlation with exposure 0 (0) 1 (1.0) 1 (0.7)
 Whether inclusion reduced the SE of the effect estimate of the primary exposure 1 (1.9) 1 (1.0) 2 (1.3)
 Model fit³ 0 (0) 1 (1.0) 1 (0.7)
 Some description provided but unclear which specific method was used 4 (7.5) 3 (3.1) 7 (4.7)
 Did not report using any data-driven methods to select covariates 36 (67.9) 74 (76.3) 110 (73.3)
Conducted quantitative bias analysis to evaluate impact of potential unadjusted/unmeasured confounders on results?
 Yes, according to methods described by Lin et al. (29) 0 (0) 1 (1.0) 1 (0.7)
 Yes, according to methods described by Ding and VanderWeele (30) 0 (0) 1 (1.0) 1 (0.7)
 Yes, by constructing a hypothetical confounder 1 (1.9) 0 (0) 1 (0.7)
 No 52 (98.1) 95 (97.9) 147 (98.0)

¹ Values are n (%).

² Categories are not mutually exclusive.

³ Specific measure of model fit used not reported.