2018 Sep 18;2018:bay096. doi: 10.1093/database/bay096

Table 1.

Comparison of annotation counts between tokenization approaches

Re-tokenization  Protein  Cellular  Tissue  Molecule  Cell    Organism
Without          97.178   99.772    95.951  96.107    97.099  93.691
With             99.187   99.866    99.842  99.424    99.559  98.921

Comparison of annotation counts after preprocessing with only the NERsuite tokenization module (Without) and with both NERsuite tokenization and additional tokenization (With). Values are annotation counts expressed as a percentage of the annotations in the provided data, reported for each entity type.
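
The percentages in Table 1 can be read as a per-type ratio of annotations remaining after preprocessing to annotations in the provided data. The sketch below illustrates that arithmetic only. The function name `annotation_coverage` and the toy label lists are hypothetical, and the source does not describe how the counts were actually obtained.

```python
from collections import Counter


def annotation_coverage(provided, recovered):
    """Per entity type, the recovered count as a percentage of the provided count.

    Both arguments are iterables holding one entity-type label per annotation.
    This is an illustrative assumption, not the authors' procedure.
    """
    provided_counts = Counter(provided)
    recovered_counts = Counter(recovered)
    return {
        entity_type: round(100.0 * recovered_counts[entity_type] / total, 3)
        for entity_type, total in provided_counts.items()
    }


# Hypothetical toy data, not taken from the paper.
provided = ["Protein"] * 8 + ["Organism"] * 4
recovered = ["Protein"] * 7 + ["Organism"] * 3
coverage = annotation_coverage(provided, recovered)
print(coverage)
```

With real data, `provided` would hold the entity types of the annotations in the provided data and `recovered` those still present after tokenization.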