2021 Jul 15;12:12. doi: 10.1186/s13326-021-00247-z

Table 7.

Evaluation results for the Hallmarks of Cancer (HOC) text classification task

	Document classification			Sentence classification
Model	Precision	Recall	F₁	Precision	Recall	F₁
Baseline (no retrofitting)	77.8	51.7	62.1	56.8	30.7	39.9
22-classes retrofitted	74.4	62.1	67.7*	49.1	35.8	41.4*
117-subclasses retrofitted	74.8	62.5	68.1*	48.6	35.2	40.8*

The baseline model is a skip-gram model without any retrofitting. All figures are micro-averages expressed as percentages. Bold denotes the best F₁-score; * denotes a score that differs from the baseline with statistical significance.
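As a consistency check on the table, the F₁ columns can be recomputed from the reported precision and recall. The sketch below assumes F₁ is the standard harmonic mean of precision and recall, which also holds for micro-averaged scores. The helper name `f1_score` and the `rows` dictionary are illustrative and do not come from the article.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (any consistent scale, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 7:
# document-level triple first, sentence-level triple second.
rows = {
    "Baseline (no retrofitting)": ((77.8, 51.7, 62.1), (56.8, 30.7, 39.9)),
    "22-classes retrofitted": ((74.4, 62.1, 67.7), (49.1, 35.8, 41.4)),
    "117-subclasses retrofitted": ((74.8, 62.5, 68.1), (48.6, 35.2, 40.8)),
}

# Recompute F1 for every model and both granularities, rounded to one decimal
# place to match the precision used in the table.
recomputed = {
    model: tuple(round(f1_score(p, r), 1) for (p, r, _) in settings)
    for model, settings in rows.items()
}

for model, settings in rows.items():
    reported = tuple(f1 for (_, _, f1) in settings)
    print(f"{model}: reported={reported}, recomputed={recomputed[model]}")
```

Under this assumption, the recomputed values agree with the reported F₁ scores to one decimal place for all three models at both the document and the sentence level.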