Table 2.
Micro- and macro-averaged precision, recall, and F1-score for concept recognition experiments on 188 PubMed abstracts. Neural Concept Recognizer models were trained on the Human Phenotype Ontology. The largest value in each column is italicized.
| Method | Micro precision (%) | Micro recall (%) | Micro F1-score (%) | Macro precision (%) | Macro recall (%) | Macro F1-score (%) |
|---|---|---|---|---|---|---|
| BioLarK | 78.5 | 60.5 | 68.3 | 76.6 | 66.0 | 70.9 |
| cTAKES<sup>a</sup> | 72.2 | 55.6 | 62.8 | 74.0 | 61.4 | 67.1 |
| OBO<sup>b</sup> | 78.3 | 53.7 | 63.7 | 79.5 | 58.6 | 67.5 |
| NCBO<sup>c</sup> | *81.6* | 44.0 | 57.2 | 79.5 | 48.7 | 60.4 |
| NCR<sup>d</sup> | 80.3 | 62.4 | *70.2* | *80.5* | 68.2 | *73.9* |
| NCR-H<sup>e</sup> | 74.4 | 61.5 | 67.3 | 72.2 | 67.1 | 69.6 |
| NCR-N<sup>f</sup> | 78.1 | *62.5* | 69.4 | 76.6 | *68.3* | 72.2 |
| NCR-HN<sup>g</sup> | 77.1 | 57.2 | 65.7 | 76.5 | 63.4 | 69.3 |
<sup>a</sup>cTAKES: Clinical Text Analysis and Knowledge Extraction System.
<sup>b</sup>OBO: Open Biological and Biomedical Ontologies.
<sup>c</sup>NCBO: National Center for Biomedical Ontology.
<sup>d</sup>NCR: Neural Concept Recognizer.
<sup>e</sup>NCR-H: variation of the NCR model that ignores taxonomic relations.
<sup>f</sup>NCR-N: variation of the NCR model that has not been trained on negative samples.
<sup>g</sup>NCR-HN: variation of the NCR model that ignores the taxonomy and has not been trained on negative samples.
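
The F1-scores in Table 2 can be sanity-checked against the reported precision and recall. The sketch below assumes that each F1-score is the harmonic mean of the precision and recall shown in the same row and averaging scheme; this is the standard definition, but the table itself does not state how the values were computed. The names `TABLE_2`, `f1_score`, `max_f1_gap`, and `best_method` are illustrative and not taken from the original work. Because the inputs are rounded to one decimal place, small deviations from the reported F1-scores are expected.

```python
# Values (precision, recall, reported F1) in percent, copied from Table 2.
TABLE_2 = {
    "BioLarK": {"micro": (78.5, 60.5, 68.3), "macro": (76.6, 66.0, 70.9)},
    "cTAKES":  {"micro": (72.2, 55.6, 62.8), "macro": (74.0, 61.4, 67.1)},
    "OBO":     {"micro": (78.3, 53.7, 63.7), "macro": (79.5, 58.6, 67.5)},
    "NCBO":    {"micro": (81.6, 44.0, 57.2), "macro": (79.5, 48.7, 60.4)},
    "NCR":     {"micro": (80.3, 62.4, 70.2), "macro": (80.5, 68.2, 73.9)},
    "NCR-H":   {"micro": (74.4, 61.5, 67.3), "macro": (72.2, 67.1, 69.6)},
    "NCR-N":   {"micro": (78.1, 62.5, 69.4), "macro": (76.6, 68.3, 72.2)},
    "NCR-HN":  {"micro": (77.1, 57.2, 65.7), "macro": (76.5, 63.4, 69.3)},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def max_f1_gap(table: dict) -> float:
    """Largest absolute gap between recomputed and reported F1 (assumed check)."""
    return max(
        abs(f1_score(p, r) - reported)
        for row in table.values()
        for (p, r, reported) in row.values()
    )


def best_method(table: dict, averaging: str) -> str:
    """Method with the highest reported F1 for 'micro' or 'macro' averaging."""
    return max(table, key=lambda method: table[method][averaging][2])


if __name__ == "__main__":
    print(f"largest F1 gap: {max_f1_gap(TABLE_2):.2f} percentage points")
    for averaging in ("micro", "macro"):
        print(averaging, "best F1:", best_method(TABLE_2, averaging))
```

Under this assumption, every recomputed F1-score should stay within rounding distance of the reported value, and the method with the highest F1-score under each averaging scheme can be read off directly.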