Table 1.
Descriptive statistics of the medical named entity recognition datasets.
| Dataset | Medical entity types | Entity types, n | Annotations, n |
| --- | --- | --- | --- |
| Revised JNLPBA<sup>a</sup> | DNA, RNA, protein, cell line, and cell type | 5 | 52,785 |
| BC5CDR<sup>b</sup> | Disease and chemical | 2 | 38,596 |
| AnatEM<sup>c</sup> | Organism subdivision, anatomical system, organ, multi-tissue structure, tissue, cell, developing anatomical structure, cellular component, organism substance, immaterial anatomical entity, pathological formation, and cancer | 12 | 11,562 |
<sup>a</sup>JNLPBA: Joint Workshop on Natural Language Processing in Biomedicine and its Applications.
<sup>b</sup>BC5CDR: BioCreative V CDR.
<sup>c</sup>AnatEM: Anatomical Entity Mention.