2024 Nov 26;19(11):e0311358. doi: 10.1371/journal.pone.0311358

Table 2. Summary of performance measures of STEED compared with manual human ascertainment.

| Item | Specificity (%) | Sensitivity (%) | Precision (%) | Accuracy (%) | F1-score |
|---|---|---|---|---|---|
| Training corpus (motor neuron diseases, n = 45) | | | | | |
| Species | NA | 96 | 100 | 96 | 0.98 |
| Sex | 67 | 85 | 94 | 82 | 0.89 |
| Disease model | NA | 96 | 100 | 96 | 0.98 |
| Outcome histology | 89 | 92 | 97 | 91 | 0.94 |
| Outcome behaviour | 50 | 97 | 84 | 84 | 0.90 |
| Outcome imaging | 96 | NA | NA | 96 | NA |
| Randomization | 84 | 96 | 89 | 91 | 0.93 |
| Blinding | 95 | 92 | 96 | 93 | 0.94 |
| Animal welfare | NA | 86 | 97 | 84 | 0.92 |
| Conflict of interest | 100 | 98 | 100 | 97 | 0.99 |
| Sample size calculation | 78 | 92 | 63 | 82 | 0.75 |
| ARRIVE guidelines | 100 | 100 | 100 | 100 | 1.00 |
| Data availability | 85 | 94 | 94 | 91 | 0.94 |
| Validation corpus 1 (motor neuron diseases, n = 31) | | | | | |
| Species | NA | 100 | 100 | 100 | 1.00 |
| Sex | 100 | 74 | 100 | 84 | 0.85 |
| Disease model | NA | 90 | 100 | 90 | 0.95 |
| Outcome histology | 100 | 96 | 100 | 97 | 0.98 |
| Outcome behaviour | 78 | 85 | 76 | 81 | 0.79 |
| Outcome imaging | NA | 100 | 100 | 100 | 1.00 |
| Randomization | 100 | 86 | 100 | 97 | 0.92 |
| Blinding | 100 | 89 | 100 | 97 | 0.94 |
| Animal welfare | 100 | 89 | 100 | 90 | 0.94 |
| Conflict of interest | 92 | 94 | 94 | 94 | 0.94 |
| Sample size calculation | 81 | 80 | 44 | 81 | 0.57 |
| ARRIVE guidelines | 100 | NA | NA | 100 | NA |
| Data availability | 96 | 83 | 83 | 94 | 0.83 |
| Validation corpus 2 (multiple sclerosis, n = 244) | | | | | |
| Species | NA | 75 | 100 | 75 | 0.86 |
| Sex | 76 | 83 | 93 | 82 | 0.88 |
| Disease model | NA | 87 | 100 | 88 | 0.93 |
| Outcome histology | 64 | 96 | 93 | 91 | 0.95 |
| Outcome behaviour | 66 | 91 | 81 | 82 | 0.86 |
| Outcome imaging | NA | 94 | 100 | 94 | 0.97 |
| Randomization | 93 | 81 | 75 | 90 | 0.78 |
| Blinding | 98 | 85 | 96 | 93 | 0.90 |
| Animal welfare | 86 | 80 | 95 | 82 | 0.87 |
| Conflict of interest | 96 | 97 | 90 | 97 | 0.93 |
| Sample size calculation | 94 | 100 | 27 | 97 | 0.43 |
| ARRIVE guidelines | 100 | 100 | 100 | 100 | 1.00 |
| Data availability | 100 | 80 | 80 | 100 | 0.80 |

Specificity, sensitivity, precision, and accuracy are given as percentages. For details on these measures, see the Materials and Methods section. Items that reach or exceed our pre-defined thresholds (a sensitivity of 85% and a specificity of 80%) are printed in bold font.
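
For readers who want to reproduce measures of this kind, the following is a minimal sketch using the standard confusion-matrix definitions. It is not taken from the paper, whose exact definitions are in its Materials and Methods section. The function names `performance` and `meets_thresholds` and the counts in the usage example are hypothetical. The sketch also assumes that a measure is reported as NA when its denominator is zero (for example, specificity when no paper in the corpus lacks the item), and that an NA measure does not count as meeting a threshold.

```python
from typing import Dict, Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None (NA) if the denominator is zero."""
    return numerator / denominator if denominator else None


def performance(tp: int, fp: int, tn: int, fn: int) -> Dict[str, Optional[float]]:
    """Standard measures from counts of tool output versus human ascertainment.

    tp/fn: item present per human rater, detected/missed by the tool.
    tn/fp: item absent per human rater, correctly rejected/wrongly detected.
    """
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)

    # F1 is the harmonic mean of precision and sensitivity (recall).
    f1 = None
    if sensitivity is not None and precision is not None and sensitivity + precision > 0:
        f1 = 2 * precision * sensitivity / (precision + sensitivity)

    return {
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


def meets_thresholds(
    measures: Dict[str, Optional[float]],
    min_sensitivity: float = 0.85,
    min_specificity: float = 0.80,
) -> bool:
    """Check the thresholds named in the table note (85% sensitivity, 80% specificity).

    Assumption: an undefined (NA) measure is treated as not meeting its threshold.
    """
    sens = measures["sensitivity"]
    spec = measures["specificity"]
    if sens is None or spec is None:
        return False
    return sens >= min_sensitivity and spec >= min_specificity


if __name__ == "__main__":
    # Hypothetical counts for illustration only; they are not from the study.
    example = performance(tp=46, fp=2, tn=48, fn=4)
    for name, value in example.items():
        print(name, "NA" if value is None else round(value, 2))
    print("meets thresholds:", meets_thresholds(example))
```

As a consistency check against the table, a sensitivity of 96% with a precision of 100% gives an F1-score of 2 × 0.96 × 1.00 / 1.96 ≈ 0.98, which matches the Species row of the training corpus.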