Table 4.
Benchmark results for datasets with added distortions, such as mild shearing and rotation—Catastrophic and severe failure rates of each model/tool on each dataset
B. Benchmark results for datasets with distortions

Model/tool | JPO (dist) TE | JPO (dist) T<=0.3 | CLEF (dist) TE | CLEF (dist) T<=0.3 | USPTO (dist) TE | USPTO (dist) T<=0.3 | UOB (dist) TE | UOB (dist) T<=0.3 | USPTO_big (dist) TE | USPTO_big (dist) T<=0.3 | Indigo (dist) TE | Indigo (dist) T<=0.3 | DECIMER-Test augmented TE | DECIMER-Test augmented T<=0.3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
OSRA | 18% | 23% | 19% | 20% | 25% | 26% | 4% | 5% | 11% | 97% | 25% | 62% | 62% | 81% |
MolVec | 10% | 12% | 12% | 13% | 15% | 16% | 3% | 3% | 5% | 92% | 30% | 50% | 56% | 67% |
Imago | 42% | 46% | 28% | 29% | 16% | 16% | 27% | 29% | 9% | 99% | 23% | 95% | 42% | 92% |
Img2Mol | 3% | 7% | 3% | 4% | 3% | 3% | 1% | 1% | 1% | 6% | 1% | 3% | 4% | 8% |
SwinOCSR | 5% | 10% | 5% | 6% | 2% | 3% | 0.14% | 0.23% | 7% | 28% | 7% | 17% | 29% | 47% |
MolScribe | 0.44% | 1% | 3% | 3% | 0.39% | 0.43% | 0% | 0% | 0.23% | 0.27% | 1% | 1% | 20% | 25% |
DECIMER | 3% | 4% | 2% | 2% | 1% | 1% | 0% | 0% | 0.39% | 0.74% | 0.16% | 0.19% | 3% | 3% |
TE: Percentage of predictions with Tanimoto similarity values of zero and invalid predictions (catastrophic failure). T<=0.3: The percentage of predictions with Tanimoto similarity less than or equal to 0.3 (severe failure).
The best result for each metric on each dataset is marked in bold.
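
As an illustration of the two metrics defined in the table notes, the following is a minimal sketch of how TE and T<=0.3 could be computed from a list of per-prediction Tanimoto similarities. The function name `failure_rates`, the use of `None` to represent an invalid prediction, and the choice to count invalid predictions as severe failures as well are assumptions made for this example; they are not taken from the benchmark code.

```python
from typing import Optional, Sequence, Tuple


def failure_rates(
    similarities: Sequence[Optional[float]],
    severe_threshold: float = 0.3,
) -> Tuple[float, float]:
    """Return (TE, T<=threshold) as percentages of all predictions.

    `similarities` holds one Tanimoto similarity per prediction.
    Assumption: None marks an invalid (unparseable) prediction.
    """
    n = len(similarities)
    if n == 0:
        raise ValueError("at least one prediction is required")

    # Catastrophic failure (TE): invalid prediction or similarity of zero.
    catastrophic = sum(1 for s in similarities if s is None or s == 0.0)

    # Severe failure: similarity <= threshold.
    # Assumption: invalid predictions are counted here too, so that
    # the severe rate can never be lower than the catastrophic rate.
    severe = sum(1 for s in similarities if s is None or s <= severe_threshold)

    return 100.0 * catastrophic / n, 100.0 * severe / n


# Hypothetical similarities for eight predictions (not benchmark data).
example = [1.0, 0.0, None, 0.25, 0.8, 0.3, 0.65, 1.0]
te, t_severe = failure_rates(example)
print(f"TE = {te:.0f}%, T<=0.3 = {t_severe:.0f}%")  # TE = 25%, T<=0.3 = 50%
```

Under this convention every catastrophic failure is also a severe failure, which matches the pattern in the table, where each T<=0.3 value is at least as large as the corresponding TE value.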