Table 3.
Performance of different natural language processing systems.
| System | P5 (a) | R5 (b) | F5 (c) | P10 (d) | R10 (e) | F10 (f) | AUC-ROC_ranking (g) | AUC-ROC_KE (h) |
|---|---|---|---|---|---|---|---|---|
| Adapted KEA++ (i) | 0.333 | 0.211 | 0.239 | 0.281 | 0.362 | 0.292 | 0.890 | 0.780 |
| RF (j) | 0.409 | 0.267 | 0.299 | 0.339 | 0.416 | 0.346 | 0.891 | 0.821 |
| FOCUS (k) | 0.462 | 0.305 | 0.341 | 0.369 | 0.464 | 0.381 | 0.940 | 0.866 |
| P value (FOCUS vs RF) | .01 | .01 | .01 | .045 | .03 | .02 | <.001 | <.001 |
(a) P5: precision at rank 5.
(b) R5: recall at rank 5.
(c) F5: F-score at rank 5.
(d) P10: precision at rank 10.
(e) R10: recall at rank 10.
(f) F10: F-score at rank 10.
(g) AUC-ROC_ranking: area under the receiver operating characteristic curve computed on the candidate terms extracted by a system.
(h) AUC-ROC_KE: area under the receiver operating characteristic curve (KE: keyphrase extraction) computed by using all the gold-standard important terms as positive examples.
(i) KEA++: extension of the keyphrase extraction algorithm KEA.
(j) RF: random forest.
(k) FOCUS: Finding impOrtant medical Concepts most Useful to patientS.
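
The rank-based metrics defined in the footnotes can be illustrated with a minimal Python sketch for a single document. The function name `prf_at_k`, the example terms, and the choice to divide precision by k even when fewer than k terms are returned are assumptions made for illustration; the table does not specify these details or how the scores were averaged across documents.

```python
from typing import Sequence, Set, Tuple


def prf_at_k(ranked: Sequence[str], gold: Set[str], k: int) -> Tuple[float, float, float]:
    """Precision, recall, and F-score at rank k for one document.

    ranked: system output, best term first (assumed to contain no duplicates).
    gold:   gold-standard important terms for the same document.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    hits = sum(1 for term in top_k if term in gold)
    # Assumption: precision is divided by k, not by the number of terms returned.
    precision = hits / k
    recall = hits / len(gold) if gold else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f_score


# Hypothetical example: 2 of the top 5 ranked terms are in the gold standard.
ranked_terms = ["hypertension", "metformin", "follow-up", "a1c", "clinic", "neuropathy"]
gold_terms = {"hypertension", "a1c", "neuropathy", "retinopathy"}
p5, r5, f5 = prf_at_k(ranked_terms, gold_terms, k=5)
print(f"P5={p5:.3f} R5={r5:.3f} F5={f5:.3f}")
```

The F5 and F10 values in the table are not the harmonic means of the reported precision and recall in the same row. This is consistent with each metric being computed per document and then averaged, although the table itself does not state the averaging scheme.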