Author manuscript; available in PMC: 2016 Jan 31.
Published in final edited form as: J Biomed Inform. 2014 Nov 8;53:196–207. doi: 10.1016/j.jbi.2014.11.002

Table 3.

Paired classification performances (all instances) over the three data sets. ADR F-scores, non-ADR F-scores, Accuracies and 95% Confidence Intervals (CI) for each of the train-test set combinations are shown.

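| Test Data | Training Data | ADR F-score | non-ADR F-score | Accuracy (%) | 95% CI (%) |
|-----------|---------------|-------------|-----------------|--------------|-------------|
| ADE       | ADE           | 0.812       | 0.914           | 88.2         | 87.3 – 89.1 |
| ADE       | ADE + DS_ALL  | 0.789       | 0.904           | 86.9         | 85.9 – 87.8 |
| ADE       | ADE + TW_ALL  | 0.800       | 0.912           | 87.7         | 86.8 – 88.7 |
| TW        | TW            | 0.538       | 0.919           | 86.2         | 84.7 – 87.6 |
| TW        | TW + ADE_ALL  | 0.545       | 0.941           | 88.6         | 87.2 – 89.7 |
| TW        | TW + DS_ALL   | 0.597*      | 0.943           | 90.1         | 88.7 – 91.3 |
| DS        | DS            | 0.678       | 0.890           | 83.8         | 82.2 – 85.0 |
| DS        | DS + ADE_ALL  | 0.674       | 0.891           | 83.5         | 81.6 – 84.8 |
| DS        | DS + TW_ALL   | 0.704*      | 0.899           | 85.0         | 83.3 – 86.5 |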
* indicates statistically significant improvement in performance over the highest score achieved in the binary classification task.
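
For readers who want to see how the three kinds of figures in the table relate to a confusion matrix, the following is a minimal sketch. It is not the authors' code: the counts are hypothetical, and the confidence interval uses the normal approximation to the binomial, which is an assumption, since this page does not say how the intervals were computed.

```python
import math


def f_score(tp, fp, fn):
    """F-score (harmonic mean of precision and recall) for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy_with_ci(tp, fp, fn, tn, z=1.96):
    """Accuracy with an approximate 95% interval.

    Assumption: normal approximation to the binomial; the source does not
    state which interval method was used.
    """
    n = tp + fp + fn + tn
    acc = (tp + tn) / n
    half_width = z * math.sqrt(acc * (1 - acc) / n)
    return acc, acc - half_width, acc + half_width


# Hypothetical counts, with ADR as the positive class (not taken from the paper).
tp, fp, fn, tn = 60, 20, 40, 880

adr_f = f_score(tp, fp, fn)
# For the non-ADR class the roles swap: true negatives become the hits,
# missed ADRs become false positives, and false ADR alarms become misses.
non_adr_f = f_score(tn, fn, fp)
acc, ci_low, ci_high = accuracy_with_ci(tp, fp, fn, tn)

print(f"ADR F-score:     {adr_f:.3f}")
print(f"non-ADR F-score: {non_adr_f:.3f}")
print(f"Accuracy:        {100 * acc:.1f}% (95% CI {100 * ci_low:.1f} to {100 * ci_high:.1f})")
```

With an imbalanced class distribution such as the one in this hypothetical example, the non-ADR F-score and the accuracy stay high even when the ADR F-score is modest, which matches the pattern visible in the TW rows of the table.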