2019 Mar 26;7(1):e13039. doi: 10.2196/13039

Table 4. Phrase-partial evaluation on the validation set.

| Feature set and features | P value [a] | Macroaverage precision | Macroaverage recall | Macroaverage F1 | Microaverage precision | Microaverage recall | Microaverage F1 |
|---|---|---|---|---|---|---|---|
| Basic features | | | | | | | |
| Basic | N/A [b] | 0.828 | 0.450 | 0.583 | 0.930 | 0.597 | 0.727 |
| Enhanced token features | | | | | | | |
| B [c]+Is-ICD9 [d]-Code | <.001 | 0.874 | 0.472 | 0.613 | 0.887 | 0.640 | 0.744 |
| B+Is-Medical-Unit | <.001 | 0.828 | 0.402 | 0.541 | 0.959 | 0.538 | 0.689 |
| B+Entity-Attributes | <.001 | 0.823 | 0.398 | 0.537 | 0.948 | 0.528 | 0.678 |
| B+Stem | .03 | 0.856 | 0.572 | 0.686 | 0.864 | 0.678 | 0.760 |
| Contextual features | | | | | | | |
| B+Section | <.001 | 0.783 | 0.544 | 0.642 | 0.874 | 0.682 | 0.766 |
| B+ICD9-Annotation | <.001 | 0.888 | 0.462 | 0.608 | 0.928 | 0.598 | 0.727 |
| B+ICD9-Annotation-Post | <.001 | 0.823 | 0.478 | 0.605 | 0.912 | 0.604 | 0.727 |
| Combination (B+Enhanced+Context) | | | | | | | |
| B+all Enhanced (C [e]+U [f]+E [g]+S [h])+all Context (T [i]+A [j]+AP [k]) | <.001 | 0.793 | 0.633 | 0.704 | 0.757 | 0.714 | 0.735 |
| B+Enhanced (C+E+S)+all Context (T+A+AP) | <.001 | 0.837 | 0.483 | 0.613 | 0.895 | 0.546 | 0.678 |
| B+Enhanced (C+E+S)+Context (A+AP) | <.001 | 0.874 | 0.529 | 0.659 | 0.906 | 0.630 | 0.743 |
| *B+all Enhanced (C+U+E+S)+Context (A+AP)* [l] | <.001 | 0.862 | 0.567 | 0.684 | 0.880 | 0.681 | 0.768 |
| B+Enhanced (C+S)+Context (A+AP) | <.001 | 0.799 | 0.509 | 0.622 | 0.896 | 0.616 | 0.730 |
| Non-CRF [m] model | | | | | | | |
| Only uses annotated ICD9 codes as a rule to identify constructs | <.001 | 0.803 | 0.139 | 0.236 | 0.885 | 0.059 | 0.111 |

[a] We conducted McNemar's test to measure the difference between the results of using basic features and other features.

[b] N/A: not applicable.

[c] B: basic.

[d] ICD9: International Classification of Diseases 9.

[e] C: Is-ICD9-Code.

[f] U: Is-Medical-Unit.

[g] E: Entity-Attributes.

[h] S: stem.

[i] T: section.

[j] A: ICD9-Annotation.

[k] AP: ICD9-Annotation-Post.

[l] The best-performing model is italicized.

[m] CRF: conditional random field.
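
The F1 columns in Table 4 can be sanity-checked against the precision and recall columns. The sketch below assumes that each reported F1 is the harmonic mean of the precision and recall reported under the same averaging scheme; the table does not state the formula, so this is an assumption, and the names `f1_score`, `ROWS`, and `max_f1_gap` are hypothetical helpers introduced only for illustration. The values are copied from three rows of the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (feature set, averaging, precision, recall, reported F1), copied from Table 4.
ROWS = [
    ("Basic", "macro", 0.828, 0.450, 0.583),
    ("Basic", "micro", 0.930, 0.597, 0.727),
    ("B+Stem", "macro", 0.856, 0.572, 0.686),
    ("B+Stem", "micro", 0.864, 0.678, 0.760),
    ("B+all Enhanced (C+U+E+S)+Context (A+AP)", "macro", 0.862, 0.567, 0.684),
    ("B+all Enhanced (C+U+E+S)+Context (A+AP)", "micro", 0.880, 0.681, 0.768),
]


def max_f1_gap(rows) -> float:
    """Largest absolute gap between recomputed and reported F1 values."""
    return max(abs(f1_score(p, r) - reported) for _, _, p, r, reported in rows)


if __name__ == "__main__":
    for label, averaging, p, r, reported in ROWS:
        computed = f1_score(p, r)
        print(f"{label:<42} {averaging:<5} computed={computed:.3f} reported={reported:.3f}")
```

Because the published precision and recall are themselves rounded to three decimals, a recomputed F1 can differ from the reported one in the last digit, so any comparison should allow a small tolerance rather than require exact equality.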