J Am Med Inform Assoc. 2018 Oct 1;25(10):1274–1283. doi: 10.1093/jamia/ocy114

Table 1.

Performance metrics for selected system submissions for subtask-1, baselines, and system ensembles. Precision, recall, and F1-score over the ADR class are shown; the top F1-score among all systems is shown in bold. Detailed discussions of the approaches can be found in the referenced system description papers.

| System/Team | ADR precision | ADR recall | ADR F1-score |
|---|---|---|---|
| Baseline 1: Naïve Bayes | 0.774 | 0.098 | 0.174 |
| Baseline 2: SVMs with RBF kernel | 0.501 | 0.215 | 0.219 |
| Baseline 3: Random Forest | 0.429 | 0.066 | 0.115 |
| NRC-Canada [35] | 0.392 | 0.488 | 0.435 |
| CSaRUS-CNN [50] (Arizona State University) | 0.437 | 0.393 | 0.414 |
| NorthEasternNLP [51] (Northeastern University) | 0.395 | 0.431 | 0.412 |
| UKNLP [41] (University of Kentucky) | 0.498 | 0.337 | 0.402 |
| TsuiLab [52] (University of Pittsburgh) | 0.336 | 0.348 | 0.342 |
| Ensemble all: best configuration (>6 ADR votes) | 0.435 | 0.492 | 0.461 |
| Ensemble top 7: majority vote (>3) | 0.529 | 0.398 | 0.454 |
| Ensemble top 7: >2 ADR votes | 0.462 | 0.492 | **0.476** |
| Ensemble top 5: majority vote (>2) | 0.521 | 0.415 | 0.462 |
| Ensemble top 5: at least 1 ADR vote | 0.304 | 0.641 | 0.413 |
| Ensemble top 3: >1 ADR vote | 0.464 | 0.441 | 0.452 |
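
Because several rows are vote-threshold ensembles of the individual system outputs, a minimal sketch may help make the construction concrete. The sketch below is not the authors' released code; the prediction arrays, function names (`ensemble_votes`, `adr_metrics`), and the toy data are illustrative assumptions. It labels a tweet as ADR when more than `min_votes` of the component systems predict ADR, then scores precision, recall, and F1 over the ADR class, matching the columns above.

```python
# Minimal sketch of a vote-threshold ensemble over binary ADR predictions.
# Hypothetical data; the shared-task systems' actual outputs are not shown here.

def ensemble_votes(system_preds, min_votes):
    """Label a tweet as ADR (1) if more than `min_votes` systems voted ADR."""
    return [1 if sum(votes) > min_votes else 0
            for votes in zip(*system_preds)]

def adr_metrics(gold, pred):
    """Precision, recall, and F1 computed over the positive (ADR) class only."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: 7 systems, 5 tweets (one prediction list per system).
system_preds = [
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1],
    [1, 0, 0, 1, 1],
    [0, 0, 1, 0, 1],
]
gold = [1, 0, 1, 0, 1]
pred = ensemble_votes(system_preds, min_votes=2)
print(pred, adr_metrics(gold, pred))
```

Under these assumptions, the "Ensemble top 7: >2 ADR votes" row would correspond to `min_votes=2` applied to the seven best-performing systems, and the majority-vote rows to `min_votes` set at half the ensemble size.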