Accuracy and Tanimoto similarity are reported in %. Best results are in bold. Benchmark: performance on the benchmark datasets described in Section 3.2.2. Depiction: results for the molecular optical recognition task for different cheminformatics depiction libraries, using a random subset of 5,000 compounds from the Img2Mol test set, each depicted five times (with the previously mentioned augmentations) by each of the three libraries.
|  | Img2Mol |  | MolVec 0.9.8 |  | Imago 2.0 |  | OSRA 2.1 |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | Accuracy | Tanimoto | Accuracy | Tanimoto | Accuracy | Tanimoto | Accuracy | Tanimoto |
| **Benchmark** |  |  |  |  |  |  |  |  |
| Img2Mol | **88.25** | **95.27** | 2.59 | 13.03 | 0.02 | 4.74 | 2.59 | 13.03 |
| STAKER | **64.33** | **83.76** | 5.32 | 31.78 | 0.07 | 5.06 | 5.23 | 26.98 |
| USPTO | **42.29** | **73.07** | 30.68 | 65.50 | 5.07 | 7.28 | 6.37 | 44.21 |
| UoB | **78.18** | **88.51** | 75.01 | 86.88 | 5.12 | 7.19 | 70.89 | 85.27 |
| CLEF | **48.84** | **78.04** | 44.48 | 76.61 | 26.72 | 41.29 | 17.04 | 58.84 |
| JPO | 45.14 | **69.43** | **49.48** | 66.46 | 23.18 | 37.47 | 33.04 | 49.62 |
| **Depiction** |  |  |  |  |  |  |  |  |
| RDKit | **93.4 ± 0.2** | **97.4 ± 0.1** | 3.7 ± 0.3 | 24.7 ± 0.1 | 0.3 ± 0.1 | 17.9 ± 0.3 | 4.4 ± 0.4 | 17.5 ± 0.5 |
| OE | **89.5 ± 0.2** | **95.8 ± 0.1** | 33.4 ± 0.4 | 57.4 ± 0.3 | 12.3 ± 0.2 | 32.0 ± 0.2 | 26.3 ± 0.4 | 50.0 ± 0.4 |
| Indigo | **79.0 ± 0.3** | **91.5 ± 0.1** | 22.2 ± 0.5 | 37.0 ± 0.5 | 4.2 ± 0.2 | 19.7 ± 0.2 | 22.6 ± 0.2 | 41.0 ± 0.2 |
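The Tanimoto similarity reported above is the Jaccard coefficient over molecular fingerprint bits; in practice it is typically computed on hashed circular fingerprints (e.g. Morgan/ECFP) via a cheminformatics toolkit such as RDKit. A minimal sketch of the metric itself, using plain Python sets of on-bit indices as a stand-in for real fingerprints (the bit values below are hypothetical):

```python
def tanimoto(fp_a: frozenset, fp_b: frozenset) -> float:
    """Tanimoto (Jaccard) coefficient between two fingerprint bit sets."""
    if not fp_a and not fp_b:
        return 1.0  # convention: two empty fingerprints count as identical
    return len(fp_a & fp_b) / len(fp_a | fp_b)

# Toy on-bit sets standing in for a predicted vs. ground-truth fingerprint.
pred = frozenset({1, 4, 7, 9})
truth = frozenset({1, 4, 7, 12})
print(round(tanimoto(pred, truth) * 100, 2))  # similarity in %, as in the table → 60.0
```

With real molecules, the same ratio is taken over the on-bits of the fingerprints of the predicted and reference structures, so a value of 100% corresponds to fingerprint-identical molecules.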