2023 Aug 19;14:5045. doi: 10.1038/s41467-023-40782-0

Table 4.

Benchmark results for datasets with added distortions, such as mild shearing and rotation: catastrophic and severe failure rates of each model/tool on each dataset

B. Benchmark results for datasets with distortions
	JPO (dist)		CLEF (dist)		USPTO (dist)		UOB (dist)		USPTO_big (dist)		Indigo (dist)		DECIMER-Test augmented
	T_E	T_<=0.3	T_E	T_<=0.3	T_E	T_<=0.3	T_E	T_<=0.3	T_E	T_<=0.3	T_E	T_<=0.3	T_E	T_<=0.3
OSRA	18%	23%	19%	20%	25%	26%	4%	5%	11%	97%	25%	62%	62%	81%
MolVec	10%	12%	12%	13%	15%	16%	3%	3%	5%	92%	30%	50%	56%	67%
Imago	42%	46%	28%	29%	16%	16%	27%	29%	9%	99%	23%	95%	42%	92%
Img2Mol	3%	7%	3%	4%	3%	3%	1%	1%	1%	6%	1%	3%	4%	8%
SwinOCSR	5%	10%	5%	6%	2%	3%	0.14%	0.23%	7%	28%	7%	17%	29%	47%
MolScribe	0.44%	1%	3%	3%	0.39%	0.43%	0%	0%	0.23%	0.27%	1%	1%	20%	25%
DECIMER	3%	4%	2%	2%	1%	1%	0%	0%	0.39%	0.74%	0.16%	0.19%	3%	3%

T_E: The percentage of predictions that are invalid or have a Tanimoto similarity of zero (catastrophic failure). T_<=0.3: The percentage of predictions with a Tanimoto similarity less than or equal to 0.3 (severe failure).

The best result for each metric on each dataset is marked in bold.
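
The two failure metrics defined in the footnote can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `failure_rates`, the use of `None` to represent an invalid prediction, and the choice to score invalid predictions as a similarity of zero (so that they count toward both metrics) are assumptions made for illustration only.

```python
from typing import Optional, Sequence, Tuple


def failure_rates(
    similarities: Sequence[Optional[float]],
    severe_threshold: float = 0.3,
) -> Tuple[float, float]:
    """Return (T_E, T_<=threshold) as percentages of all predictions.

    `similarities` holds one Tanimoto similarity per prediction;
    None stands for an invalid prediction (hypothetical convention).
    """
    if not similarities:
        raise ValueError("at least one prediction is required")

    # Assumption: an invalid prediction is scored as similarity 0,
    # so it counts as both a catastrophic and a severe failure.
    scores = [0.0 if s is None else s for s in similarities]
    total = len(scores)

    # Catastrophic failure: similarity of exactly zero (or invalid).
    catastrophic = sum(1 for s in scores if s == 0.0)
    # Severe failure: similarity at or below the threshold.
    severe = sum(1 for s in scores if s <= severe_threshold)

    return 100.0 * catastrophic / total, 100.0 * severe / total


# Ten hypothetical predictions: one zero, one invalid, two more at or below 0.3.
example = [1.0, 0.0, None, 0.25, 0.3, 0.8, 1.0, 1.0, 0.5, 1.0]
t_e, t_severe = failure_rates(example)
print(f"T_E = {t_e:.0f}%, T_<=0.3 = {t_severe:.0f}%")  # T_E = 20%, T_<=0.3 = 40%
```

Under this convention T_<=0.3 can never be smaller than T_E, which matches the pattern in every row of the table. The similarity values themselves would come from comparing fingerprints of the predicted and reference structures, a step this sketch leaves out.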