Compound name recognition results for the fine-tuned model, ChemDataExtractor, and MatSciBert on the USPTO-ORD-100K test set. In this task, a set of compound names (entities) is extracted from unstructured text and evaluated against the ground truth.
| Model | Accurate | Removal | Addition | Alteration | Total |
|---|---|---|---|---|---|
| Fine-tuned | 94.9% | 4.1% | 2.2% | 1.0% | 78 408 |
| ChemDataExtractor | 76.1% | 16.0% | 22.7% | 8.0% | |
| MatSciBert | 96.6% | 2.2% | 2.4% | 1.2% | |
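
The set-based comparison described in the caption can be illustrated with a minimal sketch. The category definitions used here are assumptions, since the text does not spell them out: "accurate" is taken to mean an exact string match, "alteration" a predicted name that closely resembles an otherwise unmatched ground-truth name, "removal" a ground-truth name with no counterpart in the prediction, and "addition" a predicted name with no counterpart in the ground truth. The function name `compare_entities` and the similarity threshold of 0.8 are hypothetical and are not taken from the evaluated models or their evaluation code.

```python
from difflib import SequenceMatcher


def compare_entities(predicted, truth, threshold=0.8):
    """Classify differences between predicted and ground-truth name sets.

    Category definitions and the threshold are illustrative assumptions.
    """
    predicted, truth = set(predicted), set(truth)
    accurate = sorted(predicted & truth)   # exact matches
    extra = sorted(predicted - truth)      # names only in the prediction
    missing = sorted(truth - predicted)    # names only in the ground truth

    def similarity(a, b):
        return SequenceMatcher(None, a, b).ratio()

    altered = []
    for name in list(missing):
        # Pair a missing name with the most similar leftover prediction.
        best = max(extra, key=lambda p: similarity(name, p), default=None)
        if best is not None and similarity(name, best) >= threshold:
            altered.append((name, best))
            missing.remove(name)
            extra.remove(best)

    return {
        "accurate": accurate,
        "removal": missing,
        "addition": extra,
        "alteration": altered,
    }


result = compare_entities(
    predicted=["sodium hydroxid", "ethanol", "water"],
    truth=["sodium hydroxide", "ethanol", "toluene"],
)
print(result)
```

In this toy example, "ethanol" counts as accurate, the truncated "sodium hydroxid" is paired with "sodium hydroxide" as an alteration, "toluene" is a removal, and "water" is an addition. Under definitions like these, a single record can contribute to several categories, which would be consistent with the table's percentages not summing to 100%.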