Table 4.
Phrase-partial evaluation on the validation set.
| Feature set and features | P value^a | Macroaverage precision | Macroaverage recall | Macroaverage F1 | Microaverage precision | Microaverage recall | Microaverage F1 |
|---|---|---|---|---|---|---|---|
| Basic features | | | | | | | |
| Basic | N/A^b | 0.828 | 0.450 | 0.583 | 0.930 | 0.597 | 0.727 |
| Enhanced token features | | | | | | | |
| B^c+Is-ICD9^d-Code | <.001 | 0.874 | 0.472 | 0.613 | 0.887 | 0.640 | 0.744 |
| B+Is-Medical-Unit | <.001 | 0.828 | 0.402 | 0.541 | 0.959 | 0.538 | 0.689 |
| B+Entity-Attributes | <.001 | 0.823 | 0.398 | 0.537 | 0.948 | 0.528 | 0.678 |
| B+Stem | .03 | 0.856 | 0.572 | 0.686 | 0.864 | 0.678 | 0.760 |
| Contextual features | | | | | | | |
| B+Section | <.001 | 0.783 | 0.544 | 0.642 | 0.874 | 0.682 | 0.766 |
| B+ICD9-Annotation | <.001 | 0.888 | 0.462 | 0.608 | 0.928 | 0.598 | 0.727 |
| B+ICD9-Annotation-Post | <.001 | 0.823 | 0.478 | 0.605 | 0.912 | 0.604 | 0.727 |
| Combination (B+Enhanced+Context) | | | | | | | |
| B+all Enhanced (C^e+U^f+E^g+S^h)+all Context (T^i+A^j+AP^k) | <.001 | 0.793 | 0.633 | 0.704 | 0.757 | 0.714 | 0.735 |
| B+Enhanced (C+E+S)+all Context (T+A+AP) | <.001 | 0.837 | 0.483 | 0.613 | 0.895 | 0.546 | 0.678 |
| B+Enhanced (C+E+S)+Context (A+AP) | <.001 | 0.874 | 0.529 | 0.659 | 0.906 | 0.630 | 0.743 |
| *B+all Enhanced (C+U+E+S)+Context (A+AP)*^l | <.001 | 0.862 | 0.567 | 0.684 | 0.880 | 0.681 | 0.768 |
| B+Enhanced (C+S)+Context (A+AP) | <.001 | 0.799 | 0.509 | 0.622 | 0.896 | 0.616 | 0.730 |
| Non-CRF^m model | | | | | | | |
| Only uses annotated ICD9 codes as a rule to identify constructs | <.001 | 0.803 | 0.139 | 0.236 | 0.885 | 0.059 | 0.111 |

^a We conducted McNemar's test to measure the difference between the results of using basic features and other features.
^b N/A: not applicable.
^c B: basic.
^d ICD9: International Classification of Diseases 9.
^e C: Is-ICD9-Code.
^f U: Is-Medical-Unit.
^g E: Entity-Attributes.
^h S: stem.
^i T: section.
^j A: ICD9-Annotation.
^k AP: ICD9-Annotation-Post.
^l The best-performing model is italicized.
^m CRF: conditional random field.
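
The P values in this table come from McNemar's test, which compares two models on the same items using only the discordant pairs. The source does not say which variant was used, so the following is only a minimal sketch of one common form, the chi-square version with continuity correction. The function name `mcnemar_test` and the counts `b` and `c` are hypothetical and are not taken from the study.

```python
import math
from typing import Tuple


def mcnemar_test(b: int, c: int) -> Tuple[float, float]:
    """Chi-square McNemar test with continuity correction (assumed variant).

    b: items the baseline model got right and the comparison model got wrong.
    c: items the baseline model got wrong and the comparison model got right.
    Returns (test statistic, two-sided P value) with 1 degree of freedom.
    """
    if b < 0 or c < 0:
        raise ValueError("discordant counts must be non-negative")
    if b + c == 0:
        # No discordant pairs: the two models are indistinguishable.
        return 0.0, 1.0
    # Continuity-corrected statistic, floored at zero.
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    # Illustrative counts only, not data from the table.
    stat, p_value = mcnemar_test(25, 5)
    print(f"statistic={stat:.3f}, P={p_value:.4f}")
```

With small discordant counts, an exact binomial version of the test is usually preferred over this chi-square approximation.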