Author manuscript; available in PMC: 2015 Dec 22.
Published in final edited form as: KDD. 2015 Aug;2015:995–1004. doi: 10.1145/2783258.2783362

Table 5. Performance comparisons on three datasets in terms of Precision, Recall and F1 score.

| Method | NYT Precision | NYT Recall | NYT F1 | Yelp Precision | Yelp Recall | Yelp F1 | Tweet Precision | Tweet Recall | Tweet F1 |
|---|---|---|---|---|---|---|---|---|---|
| Pattern [7] | 0.4576 | 0.2247 | 0.3014 | 0.3790 | 0.1354 | 0.1996 | 0.2107 | 0.2368 | 0.2230 |
| FIGER [12] | 0.8668 | 0.8964 | 0.8814 | 0.5010 | 0.1237 | 0.1983 | 0.7354 | 0.1951 | 0.3084 |
| SemTagger [9] | 0.8667 | 0.2658 | 0.4069 | 0.3769 | 0.2440 | 0.2963 | 0.4225 | 0.1632 | 0.2355 |
| APOLLO [22] | 0.9257 | 0.6972 | 0.7954 | 0.3534 | 0.2366 | 0.2834 | 0.1471 | 0.2635 | 0.1883 |
| NNPLB [11] | 0.7487 | 0.5538 | 0.6367 | 0.4248 | 0.6397 | 0.5106 | 0.3327 | 0.1951 | 0.2459 |
| ClusType-NoClus | 0.9130 | 0.8685 | 0.8902 | 0.7629 | 0.7581 | 0.7605 | 0.3466 | 0.4920 | 0.4067 |
| ClusType-NoWm | 0.9244 | 0.9015 | 0.9128 | 0.7812 | 0.7634 | 0.7722 | 0.3539 | 0.5434 | 0.4286 |
| ClusType-TwoStep | 0.9257 | 0.9033 | 0.9143 | 0.8025 | 0.7629 | 0.7821 | 0.3748 | 0.5230 | 0.4367 |
| ClusType | 0.9550 | 0.9243 | 0.9394 | 0.8333 | 0.7849 | 0.8084 | 0.3956 | 0.5230 | 0.4505 |
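
The table does not spell out how F1 is computed. A minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall, F1 = 2PR / (P + R), shows how the reported F1 values can be checked against the precision and recall columns. The function name `f1_score` and the dictionary `clustype_rows` are illustrative and not from the paper; the numbers are copied from the ClusType row of Table 5.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the ClusType row of Table 5.
# The harmonic-mean definition itself is an assumption, not stated in the table.
clustype_rows = {
    "NYT": (0.9550, 0.9243, 0.9394),
    "Yelp": (0.8333, 0.7849, 0.8084),
    "Tweet": (0.3956, 0.5230, 0.4505),
}

for dataset, (p, r, reported) in clustype_rows.items():
    computed = f1_score(p, r)
    # Print both values to four decimal places, the precision used in the table.
    print(f"{dataset}: computed F1 = {computed:.4f}, reported F1 = {reported:.4f}")
```

Under this assumption, the recomputed values agree with the reported ones to within the rounding of the four-decimal inputs, which suggests the table's F1 is computed from the listed precision and recall rather than averaged separately.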