2022 Oct 25;11:229. doi: 10.1186/s13643-022-02082-4

Table 2.

Model metrics for the internal and external validation datasets

Internal validation

This dataset had 600 articles, of which ~15% were CRTs.

Number needed to read: 6.8ᵃ

| Model | AUC, % (95% CI) | True positive rate (sensitivity), % (95% CI) | False positive rate (1 − specificity), % (95% CI) | Number needed to screen (95% CI) |
|---|---|---|---|---|
| Convolutional neural network (Word2Vec) | 98.2 (96.9, 99.5) | 96.6 (92.0, 100) | 13.9 (10.7, 17.0) | 1.8 (1.6, 2.1) |
| Convolutional neural network (FastText) | 98.4 (97.3, 99.5) | 89.8 (83.0, 96.6) | 3.5 (2.0, 5.1) | 1.2 (1.1, 1.3) |
| Support vector machines | 97.2 (95.7, 98.8) | 97.7 (94.3, 100) | 19.9 (16.4, 23.2) | 2.2 (1.9, 2.6) |
| Ensemble | 98.6 (97.8, 99.4) | 97.7 (94.3, 100) | 15.0 (11.9, 18.2) | 1.9 (1.7, 2.2) |

External validation

This dataset had 1916 articles, of which ~35% were CRTs.

Number needed to read: 2.9ᵃ

| Model | AUC, % (95% CI) | True positive rate (sensitivity), % (95% CI) | False positive rate (1 − specificity), % (95% CI) | Number needed to screen (95% CI) |
|---|---|---|---|---|
| Convolutional neural network (Word2Vec) | 97.9 (97.2, 98.6) | 97.0 (95.6, 98.2) | 20.8 (18.5, 23.0) | 1.4 (1.3, 1.5) |
| Convolutional neural network (FastText) | 97.7 (97.0, 98.4) | 91.7 (89.8, 93.8) | 4.8 (3.7, 6.0) | 1.1 (1.1, 1.1) |
| Support vector machines | 96.8 (96.0, 97.6) | 97.3 (96.1, 98.5) | 32.2 (29.7, 34.9) | 1.6 (1.6, 1.7) |
| Ensemble | 97.8 (97.0, 98.5) | 97.6 (96.4, 98.6) | 21.8 (19.6, 24.1) | 1.4 (1.4, 1.5) |

ᵃ The number needed to read was calculated as one divided by the proportion of articles that are CRTs
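The footnote's calculation can be sketched in a few lines of Python. This is an illustration only: the function name `number_needed_to_read` is hypothetical, and the inputs are the rounded proportions quoted in the table (~15% and ~35%), not the exact article counts, which the table does not give.

```python
def number_needed_to_read(crt_proportion: float) -> float:
    """Expected number of articles read per CRT found when no classifier is used.

    crt_proportion is the fraction (0-1) of articles in the dataset that are CRTs.
    """
    if not 0 < crt_proportion <= 1:
        raise ValueError("crt_proportion must be in (0, 1]")
    return 1.0 / crt_proportion


# Rounded proportions taken from the table (assumed inputs, not exact counts).
internal_nnr = number_needed_to_read(0.15)
external_nnr = number_needed_to_read(0.35)

print(f"Internal validation: {internal_nnr:.1f}")
print(f"External validation: {external_nnr:.1f}")
```

With the rounded 15% the internal figure comes out slightly below the reported 6.8, which is what one would expect if the exact proportion was a little under 15%.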

AUC: area under the receiver operating characteristic curve; CI: confidence interval
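The table does not state how the number needed to screen was derived. One plausible reading, offered here as an assumption rather than the authors' stated method, is that it is the reciprocal of the positive predictive value: the number of classifier-flagged articles a reviewer reads per true CRT. The sketch below computes it from sensitivity, false positive rate, and the approximate proportion of CRTs; the function name `number_needed_to_screen` is hypothetical.

```python
def number_needed_to_screen(sensitivity: float,
                            false_positive_rate: float,
                            prevalence: float) -> float:
    """Flagged articles per true CRT, i.e. 1 / positive predictive value.

    All arguments are fractions between 0 and 1.
    ASSUMPTION: this definition is not given in the table; it is one
    interpretation that can be checked against the reported values.
    """
    true_pos = sensitivity * prevalence                  # CRTs correctly flagged
    false_pos = false_positive_rate * (1 - prevalence)   # non-CRTs wrongly flagged
    if true_pos <= 0:
        raise ValueError("no true positives: number needed to screen is undefined")
    return (true_pos + false_pos) / true_pos


# External validation rows, using the rounded ~35% proportion of CRTs.
external_rows = {
    "CNN (Word2Vec)": (0.970, 0.208),
    "CNN (FastText)": (0.917, 0.048),
    "Support vector machines": (0.973, 0.322),
    "Ensemble": (0.976, 0.218),
}
external_nns = {
    model: number_needed_to_screen(sens, fpr, 0.35)
    for model, (sens, fpr) in external_rows.items()
}
for model, nns in external_nns.items():
    print(f"{model}: {nns:.1f}")
```

Under this assumption the computed values round to the figures reported for the external validation dataset, which lends the interpretation some support but does not confirm it.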