2019 May 22;47(W1):W587–W593. doi: 10.1093/nar/gkz389

Table 1.

Improvements in concept tagger performance from PubTator to PTC for each concept type

| Type | Training/evaluation corpus | Doc type | PubTator (tagger, performance) | PTC (tagger, performance) |
|---|---|---|---|---|
| Gene | BioCreative II GN (45) | Abstract | GenNorm (34), 80.10% | GNormPlus (35), 86.70% |
| Variant | BRONCO (24) | Full text | tmVar (31), N/A | tmVar 2.0 (38), 86.24% |
| Disease | NCBI Disease (46) | Abstract | DNorm (32), 80.60% | TaggerOne (39), 83.70% |
| Chemical | BioCreative V CDR (41) | Abstract | Dictionary, 53.82% | TaggerOne, 89.50% |
| Species | Linnaeus (43) | Full text | SR4GN (33), 85.42% | SR4GN (33), 85.42% |
| Cell Line | BioCreative VI BioID corpus (44) | Full text (caption) | N/A | TaggerOne, 83.10% |

Performance listed is the F1 score for concept identification (normalization). The previous version of tmVar does not provide accession identifiers (dbSNP RS numbers) for variants located within the text. Cell line annotations are new in PTC.
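
As a rough illustration of the improvements the table summarizes, the following sketch computes the gain in F1, in percentage points, from PubTator to PTC for each concept type. The scores are copied from Table 1; the names `F1_SCORES` and `f1_gain` are hypothetical and not part of PubTator or PTC, and entries listed as N/A are represented as `None`, so no gain is computed for them.

```python
# F1 scores (%) copied from Table 1 as (PubTator, PTC).
# None marks entries the table lists as N/A.
F1_SCORES = {
    "Gene": (80.10, 86.70),
    "Variant": (None, 86.24),
    "Disease": (80.60, 83.70),
    "Chemical": (53.82, 89.50),
    "Species": (85.42, 85.42),
    "Cell Line": (None, 83.10),
}


def f1_gain(pubtator, ptc):
    """Return the PTC minus PubTator difference in percentage points.

    Returns None when either score is unavailable (N/A in the table).
    """
    if pubtator is None or ptc is None:
        return None
    # Round to two decimals to hide floating-point noise.
    return round(ptc - pubtator, 2)


gains = {concept: f1_gain(*scores) for concept, scores in F1_SCORES.items()}

for concept, gain in gains.items():
    if gain is None:
        print(f"{concept}: no PubTator score to compare against")
    else:
        print(f"{concept}: {gain:+.2f} percentage points")
```

These are absolute differences between the reported scores, not relative improvements, and they are only comparable within a row, since each concept type is evaluated on a different corpus.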