Author manuscript; available in PMC: 2016 Aug 8.

Published in final edited form as: J Biomed Inform. 2015 Jul 2;58(Suppl):S133–S142. doi: 10.1016/j.jbi.2015.06.014

Table 4.

Summary of experiments performed to identify risk factors. PP = post-processing rules, OAI = optimization against annotation imbalance (n = number of tokens before/after annotated tokens).

CRF model	PP	OAI	Tested hypothesis
Complex	No	No	A CRF with complex features identifies more risk factors than a lexicon projection
Complex	Yes	No	Post-processing rules identify risk factors represented as numerical values higher than a defined threshold
Simple	Yes	No	A CRF with simple features (the token and its part-of-speech tag) identifies already known risk factors
Simple	Yes	Yes (n = 35)	The reduction of unannotated tokens occurring before and after annotated tokens counters annotation imbalance and improves results
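
The OAI setting in the last row can be illustrated with a minimal sketch. It assumes that the optimization keeps each annotated token together with at most n unannotated tokens on either side and discards the remaining unannotated tokens before training. The function name `keep_context_window`, the outside label `"O"`, and the label names `B-HYP` and `B-MED` are hypothetical and chosen only for illustration; the table does not specify how the reduction was implemented.

```python
def keep_context_window(tokens, labels, n, outside="O"):
    """Keep annotated tokens plus at most n unannotated tokens on each side.

    Assumption: a token is 'annotated' when its label differs from `outside`.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    keep = set()
    for i, label in enumerate(labels):
        if label != outside:
            # Window of n positions before and after the annotated token,
            # clipped to the sequence boundaries.
            start = max(0, i - n)
            end = min(len(tokens), i + n + 1)
            keep.update(range(start, end))

    kept = sorted(keep)
    return [tokens[i] for i in kept], [labels[i] for i in kept]


# Toy example with a small window (the table reports n = 35).
tokens = ["The", "patient", "has", "a", "history", "of",
          "hypertension", "and", "takes", "daily", "aspirin", "."]
labels = ["O", "O", "O", "O", "O", "O",
          "B-HYP", "O", "O", "O", "B-MED", "O"]

reduced_tokens, reduced_labels = keep_context_window(tokens, labels, n=1)
print(reduced_tokens)
print(reduced_labels)
```

With a small window most unannotated tokens are dropped, which raises the proportion of annotated tokens in the training data; a larger n retains more context at the cost of a stronger imbalance.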