Nat Commun. 2023 Aug 19;14:5045. doi: 10.1038/s41467-023-40782-0

Table 3.

Benchmark results for datasets without added distortions—Catastrophic and severe failure rates of each model/tool on each dataset

Each cell gives TE / T<=0.3.

| Model | JPO | CLEF | USPTO | UOB | USPTO Big | Indigo | Img2Mol Test | DECIMER-Hand drawn | DECIMER-Test non-augmented |
|---|---|---|---|---|---|---|---|---|---|
| OSRA | 14% / 19% | 4% / 4% | 2% / 2% | 2% / 2% | 8% / 92% | 25% / 42% | 34% / 63% | 49% / 73% | 43% / 58% |
| MolVec | 6% / 8% | 3% / 3% | 2% / 2% | 2% / 2% | 21% / 45% | 28% / 35% | 36% / 55% | 34% / 62% | 37% / 47% |
| Imago | 23% / 25% | 7% / 7% | 3% / 3% | 6% / 7% | 19% / 98% | 23% / 92% | 27% / 91% | 57% / 67% | 35% / 79% |
| Img2Mol | 2% / 7% | 3% / 3% | 3% / 3% | 1% / 1% | 1% / 2% | 1% / 2% | **0.29%** / **0.32%** | **2%** / 26% | 4% / 4% |
| SwinOCSR | 6% / 9% | 5% / 6% | 2% / 3% | 0.21% / 0.33% | 3% / 6% | 5% / 8% | 8% / 12% | 3% / **12%** | 11% / 28% |
| MolScribe | **1%** / **2%** | 3% / 3% | **0.37%** / **0.4%** | 0.02% / 0.02% | **0.22%** / **0.23%** | 1% / 1% | 1% / 2% | 5% / 17% | **2%** / **3%** |
| DECIMER | 3% / 3% | **2%** / **2%** | 1% / 1% | **0%** / **0%** | 0.25% / 0.45% | **0.20%** / **0.21%** | 2% / 3% | 5% / 17% | 4% / 4% |

TE: the percentage of predictions with a Tanimoto similarity of zero plus invalid predictions (catastrophic failure). T<=0.3: the percentage of predictions with a Tanimoto similarity less than or equal to 0.3 (severe failure).

The best result for each metric on each dataset is marked in bold.
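The failure criteria in the footnote can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's pipeline: it stands in for molecular fingerprints with plain Python sets (real evaluations compute fingerprints with a cheminformatics toolkit such as RDKit), and the helper names `tanimoto` and `classify_failure` are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def classify_failure(prediction, reference):
    """Classify a prediction per the table's footnote.

    prediction: set of fingerprint bits, or None if the predicted
    structure could not be parsed (an invalid prediction).
    Returns "catastrophic" (T == 0 or invalid), "severe" (T <= 0.3),
    or "ok" otherwise.
    """
    if prediction is None:            # invalid prediction counts toward TE
        return "catastrophic"
    t = tanimoto(prediction, reference)
    if t == 0.0:                      # no overlap at all
        return "catastrophic"
    if t <= 0.3:                      # barely related structure
        return "severe"
    return "ok"

# Toy examples with made-up fingerprint bits:
print(classify_failure({1, 2, 3}, {4, 5}))     # no shared bits -> catastrophic
print(classify_failure({1}, {1, 2, 3, 4}))     # T = 0.25      -> severe
print(classify_failure({1, 2}, {1, 2, 3}))     # T ~ 0.67      -> ok
```

Note that the TE column subsumes both zero-similarity predictions and unparsable outputs, which is why TE can never exceed the T<=0.3 column for the same model and dataset.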