2023 Sep 20;63(19):6053–6067. doi: 10.1021/acs.jcim.3c00422

Table 3. Full Breakdown of the Combined Arrow Detection/Classification Model Metrics on Our Evaluation Set^a

                                    TP    FN   FP   recall   precision   F-score
arrow detection                     1131  43   51   96.3%    95.7%       96.0%
solid arrow classification          1071  48   38   95.7%    96.6%       96.1%
curly arrow classification          33    5    17   86.8%    66.0%       75.0%
equilibrium arrow classification    12    4    5    75.0%    70.6%       72.7%
resonance arrow classification      0     0    6    N/A      0%          N/A
^a The first row shows the overall metrics for arrow detection (cf. Table 1); rows 2–5 show the evaluation of the classification task. This task is assessed independently from the detection task: detected arrows are included in false negatives in rows 2–5. No resonance arrows were present in the evaluation set, so recall and F-score are undefined (N/A) for that class.
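The percentages in Table 3 follow from the TP, FN, and FP counts using the standard definitions of recall, precision, and F-score (taken here as the harmonic mean of the two). The sketch below is an illustration, not the authors' evaluation code. The function name `detection_metrics` and the use of `None` for the N/A entries are assumptions made for this example.

```python
from typing import Optional, Tuple


def detection_metrics(
    tp: int, fn: int, fp: int
) -> Tuple[Optional[float], Optional[float], Optional[float]]:
    """Return (recall, precision, F-score) as fractions in [0, 1].

    A metric whose denominator is zero is returned as None, which
    corresponds to an N/A entry in the table (an assumption of this sketch).
    """
    # Recall: fraction of ground-truth arrows that were recovered.
    recall = tp / (tp + fn) if (tp + fn) > 0 else None
    # Precision: fraction of predictions that were correct.
    precision = tp / (tp + fp) if (tp + fp) > 0 else None
    # F-score: harmonic mean of precision and recall, undefined if
    # either input is undefined or both are zero.
    if recall is None or precision is None or (recall + precision) == 0:
        f_score = None
    else:
        f_score = 2 * precision * recall / (precision + recall)
    return recall, precision, f_score


# Counts copied from Table 3 as (TP, FN, FP).
TABLE_3 = {
    "arrow detection": (1131, 43, 51),
    "solid arrow classification": (1071, 48, 38),
    "curly arrow classification": (33, 5, 17),
    "equilibrium arrow classification": (12, 4, 5),
    "resonance arrow classification": (0, 0, 6),
}

if __name__ == "__main__":
    for name, (tp, fn, fp) in TABLE_3.items():
        cells = [
            "N/A" if m is None else f"{100 * m:.1f}%"
            for m in detection_metrics(tp, fn, fp)
        ]
        print(name, *cells)
```

For the resonance row, TP + FN = 0, so recall has a zero denominator and is reported as N/A, while precision is 0 because all six predictions were false positives.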