Table 8.
Comparison of model accuracy with human annotations based on the re‐annotation of 100 herbarium specimens of test set A of EXP1‐Fertility.ᵃ
Annotation types | True positive subset accuracy | False positive subset accuracy | True negative subset accuracy | False negative subset accuracy | Overall accuracy on these subsets | Global accuracy on test set A |
---|---|---|---|---|---|---|
ResNet50‐VeryLarge | 100.0% | 0.0% | 100.0% | 0.0% | 50.0% | 96.3% |
Human annotationᵇ | 88.0% | 68.0% | 88.0% | 76.0% | 80.0% | 87.8% |
ᵃ The global accuracy on the whole test set is the average of the per‐subset accuracies, weighted by each subset's proportion in the whole test set: 84.1%, 1.7%, 12.2%, and 2.0% for the true positive, false positive, true negative, and false negative subsets, respectively.
ᵇ Annotations were made by co‐author P.B.
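
The weighting described in the first footnote can be illustrated with a minimal Python sketch. The function and variable names are hypothetical and are not taken from the authors' code. The subset proportions are those given in the footnote, and the per-subset accuracies are those of the ResNet50‐VeryLarge row. The "overall accuracy on these subsets" column is treated here as an unweighted mean over the four subsets, which is an assumption that reproduces the tabulated 50.0% for that row.

```python
# Subset proportions in test set A, in percent, as given in the table footnote.
PROPORTIONS = {"TP": 84.1, "FP": 1.7, "TN": 12.2, "FN": 2.0}


def global_accuracy(subset_accuracy, proportions=PROPORTIONS):
    """Weighted mean of per-subset accuracies (all values in percent)."""
    total = sum(proportions.values())
    weighted = sum(subset_accuracy[k] * proportions[k] for k in proportions)
    return weighted / total


def subset_mean_accuracy(subset_accuracy):
    """Unweighted mean over the subsets (assumed definition of the
    'overall accuracy on these subsets' column)."""
    return sum(subset_accuracy.values()) / len(subset_accuracy)


# Per-subset accuracies of the ResNet50-VeryLarge row of the table.
resnet = {"TP": 100.0, "FP": 0.0, "TN": 100.0, "FN": 0.0}

print(round(global_accuracy(resnet), 1))  # 96.3
print(subset_mean_accuracy(resnet))       # 50.0
```

By construction the model is right on every true positive and true negative specimen and wrong on every false positive and false negative one, so its global accuracy is simply the combined share of the two "true" subsets, 84.1% + 12.2% = 96.3%.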