2013 Oct 3;21(3):448–454. doi: 10.1136/amiajnl-2013-001766

Table 2.

Model performance on the SHARP and ShARe test sets

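| Relation | Test corpus | Model | Precision | Recall | F1 |
|---|---|---|---|---|---|
| LocationOf | SHARP | Baseline 1 | 0.900 | 0.096 | 0.174 |
| LocationOf | SHARP | Baseline 2 | 0.910 | 0.198 | 0.325 |
| LocationOf | SHARP | Baseline 3 | 0.858 | 0.431 | 0.574 |
| LocationOf | SHARP | Baseline 4 | 0.551 | 0.522 | 0.536 |
| LocationOf | SHARP | Baseline 5 | 0.758 | 0.340 | 0.470 |
| LocationOf | SHARP | SVM trained on SHARP | 0.786 | 0.699 | 0.740 |
| LocationOf | SHARP | Composite (TK+features) | 0.828 | 0.661 | 0.735 |
| LocationOf | SHARP | Human agreement | | | 0.744 |
| LocationOf | ShARe | Baseline 1 | 1.000 | 0.356 | 0.525 |
| LocationOf | ShARe | Baseline 2 | 1.000 | 0.381 | 0.552 |
| LocationOf | ShARe | Baseline 3 | 0.971 | 0.777 | 0.863 |
| LocationOf | ShARe | Baseline 4 | 0.521 | 0.700 | 0.598 |
| LocationOf | ShARe | Baseline 5 | 0.941 | 0.556 | 0.699 |
| LocationOf | ShARe | SVM trained on ShARe | 0.953 | 0.867 | 0.908 |
| LocationOf | ShARe | SVM trained on SHARP | 0.916 | 0.883 | 0.899 |
| LocationOf | ShARe | Human agreement | | | 0.800 |
| DegreeOf | SHARP | Baseline 1 | 1.000 | 0.044 | 0.084 |
| DegreeOf | SHARP | Baseline 2 | 1.000 | 0.044 | 0.084 |
| DegreeOf | SHARP | Baseline 3 | 0.907 | 0.857 | 0.881 |
| DegreeOf | SHARP | Baseline 4 | 0.896 | 0.758 | 0.821 |
| DegreeOf | SHARP | Baseline 5 | 0.860 | 0.473 | 0.610 |
| DegreeOf | SHARP | SVM trained on SHARP | 0.869 | 0.945 | 0.905 |
| DegreeOf | SHARP | Composite (TK+features) | 0.840 | 0.923 | 0.880 |
| DegreeOf | SHARP | Human agreement | | | 0.871 |
| DegreeOf | ShARe | Baseline 1 | 0.944 | 0.121 | 0.214 |
| DegreeOf | ShARe | Baseline 2 | 0.947 | 0.128 | 0.225 |
| DegreeOf | ShARe | Baseline 3 | 0.977 | 0.887 | 0.929 |
| DegreeOf | ShARe | Baseline 4 | 0.929 | 0.745 | 0.827 |
| DegreeOf | ShARe | Baseline 5 | 0.404 | 0.979 | 0.571 |
| DegreeOf | ShARe | SVM trained on ShARe | 0.929 | 0.929 | 0.929 |
| DegreeOf | ShARe | SVM trained on SHARP | 0.926 | 0.887 | 0.906 |
| DegreeOf | ShARe | Human agreement | | | 0.664 |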

ShARe, Shared Annotated Resource; SHARP, Strategic Health Advanced Research Project.
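
The table does not state how F1 is computed. The sketch below assumes the standard definition (the harmonic mean of precision and recall) and checks a few rows of Table 2 against it. The function names, the choice of rows, and the 0.002 tolerance are illustrative assumptions, not part of the source. The tolerance is there because the published precision and recall are already rounded to three decimals, so a recomputed F1 can differ from the reported one in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1; assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from a few rows of Table 2.
ROWS = {
    "LocationOf / SHARP / SVM trained on SHARP": (0.786, 0.699, 0.740),
    "LocationOf / ShARe / SVM trained on ShARe": (0.953, 0.867, 0.908),
    "DegreeOf / SHARP / SVM trained on SHARP": (0.869, 0.945, 0.905),
    "DegreeOf / ShARe / SVM trained on ShARe": (0.929, 0.929, 0.929),
}

# Hypothetical tolerance to absorb rounding in the published values.
TOLERANCE = 0.002


def check_rows(rows, tol=TOLERANCE):
    """Map each row label to True if the recomputed F1 matches the reported one."""
    return {
        label: abs(f1_score(p, r) - reported) <= tol
        for label, (p, r, reported) in rows.items()
    }


if __name__ == "__main__":
    for label, ok in check_rows(ROWS).items():
        print(f"{label}: {'consistent' if ok else 'mismatch'}")
```

The human agreement rows report a single value, so they cannot be checked this way.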