Table 1. Evaluation Metrics of the Key Steps in the Operational Pipeline of ReactionDataExtractor v2.0.
pipeline deliverable | TP | FN | FP | recall | precision | F-score |
---|---|---|---|---|---|---|
arrow detection | 1131 | 43 | 51 | 96.3% | 95.7% | 96.0% |
diagram detection | 2292 | 237 | 84 | 90.6% | 96.5% | 93.5% |
label detection | 1439 | 464 | 504 | 75.6% | 74.1% | 74.8% |
arrow annotation detection | 756 | 299 | 167 | 71.7% | 81.9% | 76.4% |
diagram-label matching | 1255 | 235 | N/A | 84.2% | N/A | N/A |
overall reaction graph evaluation | 878 | 289 | 120 | 75.2% | 88.0% | 81.1% |
For diagram-label matching, we counted correct matches as true positives and incorrect matches as false negatives, and computed the equivalent recall metric; false positives, precision, and F-score are not defined for this step and are listed as N/A.
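The percentages in Table 1 can be derived from the raw counts. The sketch below is illustrative only: it assumes the standard definitions recall = TP / (TP + FN) and precision = TP / (TP + FP), and that the reported F-score is the F1 score (the harmonic mean of precision and recall). The function name `detection_metrics` and the `rows` dictionary are hypothetical and not part of ReactionDataExtractor.

```python
from typing import Optional, Tuple


def detection_metrics(
    tp: int, fn: int, fp: Optional[int] = None
) -> Tuple[float, Optional[float], Optional[float]]:
    """Return (recall, precision, F1) as fractions from raw counts.

    If fp is None (as for diagram-label matching, where false positives
    are not defined), precision and F1 are returned as None.
    """
    recall = tp / (tp + fn)
    if fp is None:
        return recall, None, None
    precision = tp / (tp + fp)
    # Assumption: "F-score" means F1, the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return recall, precision, f_score


# Raw counts copied from Table 1 as (TP, FN, FP); None marks N/A.
rows = {
    "arrow detection": (1131, 43, 51),
    "diagram detection": (2292, 237, 84),
    "label detection": (1439, 464, 504),
    "arrow annotation detection": (756, 299, 167),
    "diagram-label matching": (1255, 235, None),
    "overall reaction graph evaluation": (878, 289, 120),
}

for name, (tp, fn, fp) in rows.items():
    metrics = detection_metrics(tp, fn, fp)
    # Format each metric as a percentage with one decimal place, or N/A.
    cells = ["N/A" if m is None else f"{100 * m:.1f}%" for m in metrics]
    print(f"{name}: recall={cells[0]}, precision={cells[1]}, F-score={cells[2]}")
```

Under these assumptions, the computed values agree with the percentages reported in Table 1 to one decimal place.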