2021 Mar 2;11(3):e044687. doi: 10.1136/bmjopen-2020-044687

Table 5.

Methods reported for the 37 models of the 33 included studies with prediction models for TB treatment outcomes

Characteristics	Studies reporting characteristic, n (%)	Categories	N (%) or median (IQR)
Type of outcome	37 (100)	Single	29 (78)
		Composite	8 (22)
Outcome	37 (100)	Death	16 (43)
		Treatment failure	6 (16)
		Default, loss to follow-up or treatment interruption	6 (16)
		Unfavourable outcome	6 (16)
		Treatment success	2 (6)
		Other*	1 (3)
Number and prevalence of outcome†	32 (87)	–	94 (38–171); 15% (9–26)
Events per candidate variable‡	30 (81)	–	6 (3–11)
Events per variable (in final model)	29 (78)	–	14 (9–26)
Predictor types	37 (100)	Clinical/epidemiologic	34 (92)
		Adherence	1 (3)
		Biomarker	2 (5)
Analysis	37 (100)	Logistic regression	29 (78)
		Survival analysis	3 (8)
		Machine learning	5 (14)
Method for considering predictors in multivariable models	36 (97)	All candidate predictors	12 (32)
		Based on unadjusted association with outcome	19 (51)
		Based on clinical relevance	1 (3)
		Other§	4 (14)
Selection of predictors during modelling	31 (84)	Full model approach	2 (6)
		Forward selection	7 (23)
		Backwards elimination	5 (16)
		Stepwise selection	8 (26)
		Random Forest	1 (3)
		Hosmer-Lemeshow model building criteria	4 (13)
		Bayesian model averaging	3 (10)
		Pairwise selection	1 (3)
P value for consideration in model	17 (46)	0.01	2 (12)
		0.05	3 (18)
		0.11	1 (6)
		0.2	6 (35)
		0.25	5 (29)
P value for retention in MV model	20 (54)	0.05	9 (45)
		0.1	9 (45)
		0.15	1 (5)
		0.2	1 (5)
Internal validation	19 (51)	Split-sample	10 (53)
		Bootstrap	5 (26)
		Cross-validation	4 (21)
External validation	6 (16)	Temporal	1 (17)
		Geographic	1 (17)
		Setting	4 (67)
Calibration	17 (46)	Calibration plot¶	2 (12)
		Calibration slope¶	1 (6)
		Hosmer-Lemeshow goodness of fit p value¶	13 (77); median 0.51 (0.20–0.79)
		Calibration table¶	2 (12)
		Mean absolute error¶	1 (6)
Discrimination	30 (81)	C-statistic (AUROC)¶	30 (100); median 0.75 (0.68–0.84)
		Log rank test¶	2 (5)
Classification	18 (49)	Sensitivity**	14 (78); median 70 (54–78)
		Specificity**	13 (72); median 75 (71–88)
		Accuracy	2 (11)
		Other††	2 (11)
Model presentation	34 (92)	Risk score	16 (43)
		Model coefficient	8 (22)
		Nomogram	2 (6)
		ORs/relative scores	4 (12)
		Survey tool	1 (3)

*Outcome is a value from 1 to 5 (1=patient completed the treatment course in frame of DOTS, 2=cured, 3=quit treatment, 4=failed treatment and 5=death).

†Prevalence of outcome in the population used to develop the prediction model (ie, derivation/development subset if split-sample technique was used or full sample if the model was not validated or if bootstrap/cross-validation was used).

‡Only five studies report the exact number of predictors considered. Otherwise, the number of candidate predictors was estimated from the provided tables or lists of candidate predictors in the source paper.
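
The events-per-variable rows and footnote † can be illustrated with a minimal sketch. The function names and all study counts below are hypothetical and are not taken from any included study; the sketch only assumes that events per variable is the number of outcome events divided by the number of predictors (candidate or retained), and that prevalence is events divided by the size of the development sample.

```python
def events_per_variable(n_events: int, n_variables: int) -> float:
    """Outcome events divided by the number of predictor variables."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_events / n_variables


def outcome_prevalence(n_events: int, n_development: int) -> float:
    """Proportion of the development sample that experienced the outcome."""
    if n_development <= 0:
        raise ValueError("n_development must be positive")
    return n_events / n_development


# Hypothetical study: 600 patients in the development set, 90 events,
# 15 candidate predictors, 6 predictors retained in the final model.
epv_candidate = events_per_variable(90, 15)
epv_final = events_per_variable(90, 6)
prevalence = outcome_prevalence(90, 600)

print(f"EPV (candidate): {epv_candidate:.1f}")
print(f"EPV (final model): {epv_final:.1f}")
print(f"Prevalence: {prevalence:.0%}")
```

Because the table reports medians across studies, the median events per variable need not equal the median number of events divided by a median number of predictors.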

§Other methods of determining which variables to consider for the prediction model include principal components analysis (n=1), screening for multicollinearity via correlation coefficient (n=1), a combination of a priori selection and selection via univariable association (n=1), and machine-learning preprocessing (n=1).

¶Sums to more than 100%, because some studies report multiple measures of calibration or discrimination.

**Based on the following cut-off methods: Youden index (n=4), concordance probability (n=1), estimated to the nearest 0.1 for studies that present a range of sensitivity and specificity in a table or figure (n=4), or unknown (n=5).
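
As an illustration of the Youden cut-off method named in this footnote, the sketch below picks the threshold that maximises J = sensitivity + specificity − 1. It is a generic sketch and not the procedure of any included study; the function names, the toy scores and the convention that a score at or above the threshold is classified as positive are all assumptions.

```python
def sens_spec(y_true, scores, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(y_true, scores):
    """Return (J, threshold, sensitivity, specificity) maximising Youden's J."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sens_spec(y_true, scores, t)
        j = sens + spec - 1
        # Ties keep the lowest threshold, since only strict improvements replace it.
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best


# Hypothetical risk scores (1-9) and observed outcomes (1 = event).
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
toy_outcomes = [0, 0, 0, 1, 0, 1, 1, 1, 1]

j, cutoff, sens, spec = youden_cutoff(toy_outcomes, toy_scores)
print(f"cut-off={cutoff}, sensitivity={sens:.2f}, specificity={spec:.2f}, J={j:.2f}")
```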

††Other includes one study that reports false positive rate and one study that includes a graph of sensitivity versus specificity.

AUROC, area under receiver operating characteristic; c-statistic, concordance statistic; TB, tuberculosis.
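
For readers unfamiliar with the c-statistic, a minimal sketch of how it can be computed for a binary outcome follows: the proportion of event/non-event pairs in which the patient with the event has the higher predicted score, with ties counted as one half. The function name and toy data are hypothetical, and this pairwise definition is assumed to apply to binary outcomes only; survival models use related but different concordance measures.

```python
def c_statistic(y_true, scores):
    """Concordance statistic (equal to AUROC) for a binary outcome."""
    events = [s for y, s in zip(y_true, scores) if y == 1]
    non_events = [s for y, s in zip(y_true, scores) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0  # event ranked above non-event
            elif e == n:
                concordant += 0.5  # tied scores count as half
    return concordant / (len(events) * len(non_events))


# Hypothetical risk scores and observed outcomes (1 = event).
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
toy_outcomes = [0, 0, 0, 1, 0, 1, 1, 1, 1]

print(f"c-statistic: {c_statistic(toy_outcomes, toy_scores):.2f}")
```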