. 2014 Oct 24;22(e1):e151–e161. doi: 10.1136/amiajnl-2014-002642

Table 3:

Tenfold cross-validation results of customized dictionary with added features in the training set (feature contribution)

| Features | No. of features | PPV | Recall | F1 score |
|---|---|---|---|---|
| Setting 1: Baseline 6 + section parsing to discover mentions in relevant sections + section of relevant medications | 109 | 0.759 | 0.806 | 0.782 |
| Setting 2: Setting 1 + MTX signature | 169 | 0.738 | 0.829 | 0.781 |
| Setting 3: Setting 2 + DocTimeRel | 409 | 0.740 | 0.868 | 0.798 |
| Setting 4: Setting 2 + nearby words | 4806 | 0.781 | 0.884 | 0.829 |
| Setting 5: Setting 2 + nearby verbs’ part-of-speech tags | 875 | 0.762 | 0.891 | 0.821 |
| Setting 6: Setting 2 + nearby words + nearby verbs’ part-of-speech tags | 5512 | 0.780 | 0.907 | 0.839 |
| Setting 7: Setting 2 + nearby words + nearby verbs’ part-of-speech tags + DocTimeRel | 5752 | 0.800 | 0.899 | 0.847 |
| Setting 8: same feature settings as Setting 7 but patient-level classification | 5752 | 0.814 | 0.727 | 0.768 |

For examples illustrating the features, see figure 3.

MTX, methotrexate; PPV, positive predictive value.
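
The F1 scores in Table 3 can be cross-checked against the PPV and recall columns, since F1 is conventionally the harmonic mean of the two. The sketch below is only an illustrative check and is not taken from the article: the function name `f1_score` and the `rows` dictionary are hypothetical, and the values are copied from the table. Because the inputs are already rounded to three decimals, a recomputed value may differ from the reported one in the last digit.

```python
def f1_score(ppv: float, recall: float) -> float:
    """Harmonic mean of PPV (precision) and recall."""
    if ppv + recall == 0:
        return 0.0  # avoid division by zero when both are 0
    return 2 * ppv * recall / (ppv + recall)


# (PPV, recall, reported F1) as printed in Table 3
rows = {
    "Setting 1": (0.759, 0.806, 0.782),
    "Setting 2": (0.738, 0.829, 0.781),
    "Setting 3": (0.740, 0.868, 0.798),
    "Setting 4": (0.781, 0.884, 0.829),
    "Setting 5": (0.762, 0.891, 0.821),
    "Setting 6": (0.780, 0.907, 0.839),
    "Setting 7": (0.800, 0.899, 0.847),
    "Setting 8": (0.814, 0.727, 0.768),
}

for name, (ppv, recall, reported) in rows.items():
    recomputed = f1_score(ppv, recall)
    print(f"{name}: recomputed F1 = {recomputed:.3f}, reported F1 = {reported:.3f}")
```

Read this way, the drop from Setting 7 to Setting 8 follows directly from recall falling from 0.899 to 0.727 while PPV rises only from 0.800 to 0.814.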