Table 2.
Training corpora and entity types of the ScispaCy entity recognition systems
Training corpus | Entity types |
---|---|
CRAFT | GGP, SO, TAXON, CHEBI, GO, CL |
JNLPBA | DNA, CELL_TYPE, CELL_LINE, RNA, PROTEIN |
BC5CDR | DISEASE, CHEMICAL |
BIONLP13CG | AMINO_ACID, ANATOMICAL_SYSTEM, CANCER, CELL, CELLULAR_COMPONENT, DEVELOPING_ANATOMICAL_STRUCTURE, GENE_OR_GENE_PRODUCT, IMMATERIAL_ANATOMICAL_ENTITY, MULTI-TISSUE_STRUCTURE, ORGAN, ORGANISM, ORGANISM_SUBDIVISION, ORGANISM_SUBSTANCE, PATHOLOGICAL_FORMATION, SIMPLE_CHEMICAL, TISSUE |
List adapted from https://allenai.github.io/scispacy/
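
When choosing which entity recognition system to run, it can help to hold the table as a lookup structure. The following is a minimal sketch in plain Python, not part of ScispaCy itself: the names `ENTITY_TYPES` and `corpora_for` are hypothetical, and the BC5CDR labels are given as DISEASE and CHEMICAL, following the ScispaCy documentation.

```python
# Entity types per training corpus, transcribed from Table 2.
# ENTITY_TYPES and corpora_for are illustrative names, not a ScispaCy API.
ENTITY_TYPES = {
    "CRAFT": ["GGP", "SO", "TAXON", "CHEBI", "GO", "CL"],
    "JNLPBA": ["DNA", "CELL_TYPE", "CELL_LINE", "RNA", "PROTEIN"],
    "BC5CDR": ["DISEASE", "CHEMICAL"],
    "BIONLP13CG": [
        "AMINO_ACID", "ANATOMICAL_SYSTEM", "CANCER", "CELL",
        "CELLULAR_COMPONENT", "DEVELOPING_ANATOMICAL_STRUCTURE",
        "GENE_OR_GENE_PRODUCT", "IMMATERIAL_ANATOMICAL_ENTITY",
        "MULTI-TISSUE_STRUCTURE", "ORGAN", "ORGANISM",
        "ORGANISM_SUBDIVISION", "ORGANISM_SUBSTANCE",
        "PATHOLOGICAL_FORMATION", "SIMPLE_CHEMICAL", "TISSUE",
    ],
}


def corpora_for(entity_type):
    """Return the training corpora whose label set contains entity_type."""
    wanted = entity_type.strip().upper()  # labels in the table are upper case
    return [corpus for corpus, labels in ENTITY_TYPES.items() if wanted in labels]


if __name__ == "__main__":
    for label in ("CELL_LINE", "DISEASE", "ORGAN"):
        print(label, "->", corpora_for(label))
```

A label that appears in none of the four label sets yields an empty list, which signals that no system in the table covers that entity type.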