Table 5.
Characteristics | Studies reporting characteristic, n (%) | Categories | N (%) or median (IQR) |
Type of outcome | 37 (100) | Single | 29 (78) |
Composite | 8 (22) | ||
Outcome | 37 (100) | Death | 16 (43) |
Treatment failure | 6 (16) | ||
Default, loss to follow-up or treatment interruption | 6 (16) | ||
Unfavourable outcome | 6 (16) | ||
Treatment success | 2 (6) | ||
Other* | 1 (3) | ||
Number and prevalence of outcome† | 32 (87) | – | 94 (38–171); 15% (9–26) |
Events per candidate variable‡ | 30 (81) | – | 6 (3–11) |
Events per variable (in final model) | 29 (78) | – | 14 (9–26) |
Predictor types | 37 (100) | Clinical/epidemiologic | 34 (92) |
Adherence | 1 (3) | ||
Biomarker | 2 (5) | ||
Analysis | 37 (100) | Logistic regression | 29 (78) |
Survival analysis | 3 (8) | ||
Machine learning | 5 (14) | ||
Method for considering predictors in multivariable models | 36 (97) | All candidate predictors | 12 (32) |
Based on unadjusted association with outcome | 19 (51) | ||
Based on clinical relevance | 1 (3) | ||
Other§ | 4 (14) | ||
Selection of predictors during modelling | 31 (84) | Full model approach | 2 (6) |
Forward selection | 7 (23) | ||
Backwards elimination | 5 (16) | ||
Stepwise selection | 8 (26) | ||
Random Forest | 1 (3) | ||
Hosmer-Lemeshow model building criteria | 4 (13) | ||
Bayesian model averaging | 3 (10) | ||
Pairwise selection | 1 (3) | ||
P value for consideration in model | 17 (46) | 0.01 | 2 (12) |
0.05 | 3 (18) | ||
0.11 | 1 (6) | ||
0.2 | 6 (35) | ||
0.25 | 5 (29) | ||
P value for retention in MV model | 20 (54) | 0.05 | 9 (45) |
0.1 | 9 (45) | ||
0.15 | 1 (5) | ||
0.2 | 1 (5) | ||
Internal validation | 19 (51) | Split-sample | 10 (53) |
Bootstrap | 5 (26) | ||
Cross-validation | 4 (21) | ||
External validation | 6 (16) | Temporal | 1 (17) |
Geographic | 1 (17) | ||
Setting | 4 (67) | ||
Calibration | 17 (46) | Calibration plot¶ | 2 (12) |
Calibration slope¶ | 1 (6) | ||
Hosmer-Lemeshow goodness of fit p value¶ | 13 (77) | ||
0.51 (0.20–0.79) | |||
Calibration table¶ | 2 (12) | ||
Mean absolute error¶ | 1 (6) | ||
Discrimination | 30 (81) | C-statistic (AUROC)¶ | 30 (100) |
0.75 (0.68–0.84) | |||
Log rank test¶ | 2 (5) | ||
Classification | 18 (49) | Sensitivity** | 14 (78) |
70 (54–78) | |||
Specificity** | 13 (72) | ||
75 (71–88) | |||
Accuracy | 2 (11) | ||
Other†† | 2 (11) | ||
Model presentation | 34 (92) | Risk score | 16 (43) |
Model coefficient | 8 (22) | ||
Nomogram | 2 (6) | ||
ORs/relative scores | 4 (12) | ||
Survey tool | 1 (3) |
*Outcome is a value from 1 to 5 (1=patient completed the treatment course within the framework of DOTS, 2=cured, 3=quit treatment, 4=failed treatment and 5=death).
†Prevalence of outcome in the population used to develop the prediction model (ie, derivation/development subset if split-sample technique was used or full sample if the model was not validated or if bootstrap/cross-validation was used).
‡Only five studies report the exact number of predictors considered. Otherwise, the number of candidate predictors was estimated from the provided tables or lists of candidate predictors in the source paper.
§Other methods of determining which variables to consider for the prediction model include principal components analysis (n=1), screening for multicollinearity via correlation coefficient (n=1), a combination of a priori selection and selection via univariable association (n=1), and machine-learning preprocessing (n=1).
¶Sums to more than 100%, because some studies report multiple measures of calibration or discrimination.
**Based on the following cut-off methods: Youden (n=4), concordance probability (n=1), estimated at nearest 0,1 for studies that present a range of sensitivity and specificity in a table or figure (n=4), or unknown (n=5).
††Other includes one study that reports false positive rate and one study that includes a graph of sensitivity versus specificity.
AUROC, area under the receiver operating characteristic curve; c-statistic, concordance statistic; TB, tuberculosis.
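Two quantities in the table are simple to compute, and a small sketch may help readers reproduce them for their own data. The first is events per variable (outcome events divided by the number of candidate or final predictors). The second is the Youden cut-off named in footnote **, which picks the threshold maximising sensitivity + specificity − 1. The function names, the "score >= threshold means positive" convention, and the example numbers are illustrative assumptions, not taken from the reviewed studies.

```python
from typing import Sequence, Tuple


def events_per_variable(n_events: int, n_variables: int) -> float:
    """Outcome events divided by the number of predictors (candidate or final)."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_events / n_variables


def youden_cutoff(
    scores: Sequence[float], labels: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) maximising Youden's J.

    Assumption: labels are 0/1 and a case is classified positive
    when its score is >= threshold. Ties keep the lowest threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best = None  # (J, threshold, sensitivity, specificity)
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical example: 90 events and 15 candidate predictors.
epv = events_per_variable(90, 15)

# Hypothetical risk scores with observed outcomes (1 = event).
cutoff = youden_cutoff([0.1, 0.2, 0.3, 0.8, 0.9], [0, 0, 0, 1, 1])
```

This brute-force search over observed scores is adequate for small samples; for larger data sets a ROC routine from a statistics library would normally be used instead.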