2021 Sep 29;12(42):14174–14181. doi: 10.1039/d1sc01839f

Accuracy and Tanimoto similarity are reported in %. Best results are in bold. Benchmark: performance on the benchmark datasets described in Section 3.2.2. Depiction: results for the molecular optical recognition task for different cheminformatics depiction libraries. The dataset is a random subset of 5000 compounds from the Img2Mol test set, each depicted five times (with the previously mentioned augmentations) by each of the three libraries.
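Accuracy counts exact structure matches, while Tanimoto similarity also rewards near-misses. A minimal, dependency-free sketch of how both metrics could be computed is shown below; in the paper the Tanimoto similarity would be computed on molecular fingerprints (e.g. via a cheminformatics toolkit such as RDKit), and plain Python sets stand in for fingerprint bit sets here.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) similarity: |A ∩ B| / |A ∪ B|."""
    if not fp_a and not fp_b:
        return 1.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def evaluate(predictions, references):
    """Return (accuracy %, mean Tanimoto %) over paired structures.

    `predictions` and `references` are lists of (canonical_id, fingerprint_set)
    pairs -- hypothetical stand-ins for canonical SMILES strings and
    molecular fingerprints.
    """
    n = len(references)
    exact = sum(1 for (p_id, _), (r_id, _) in zip(predictions, references)
                if p_id == r_id)
    sims = [tanimoto(p_fp, r_fp)
            for (_, p_fp), (_, r_fp) in zip(predictions, references)]
    return 100.0 * exact / n, 100.0 * sum(sims) / n
```

A failed recognition can thus still score a high Tanimoto value if most of the predicted structure overlaps the reference, which is why the Tanimoto columns in the table are consistently higher than the accuracy columns.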

            Img2Mol                 MolVec 0.9.8            Imago 2.0               OSRA 2.1
            Accuracy    Tanimoto    Accuracy    Tanimoto    Accuracy    Tanimoto    Accuracy    Tanimoto
Benchmark
  Img2Mol   88.25       95.27        2.59       13.03        0.02        4.74        2.59       13.03
  STAKER    64.33       83.76        5.32       31.78        0.07        5.06        5.23       26.98
  USPTO     42.29       73.07       30.68       65.50        5.07        7.28        6.37       44.21
  UoB       78.18       88.51       75.01       86.88        5.12        7.19       70.89       85.27
  CLEF      48.84       78.04       44.48       76.61       26.72       41.29       17.04       58.84
  JPO       45.14       69.43       49.48       66.46       23.18       37.47       33.04       49.62
Depiction
  RDKit     93.4 ± 0.2  97.4 ± 0.1   3.7 ± 0.3  24.7 ± 0.1   0.3 ± 0.1  17.9 ± 0.3   4.4 ± 0.4  17.5 ± 0.5
  OE        89.5 ± 0.2  95.8 ± 0.1  33.4 ± 0.4  57.4 ± 0.3  12.3 ± 0.2  32.0 ± 0.2  26.3 ± 0.4  50.0 ± 0.4
  Indigo    79.0 ± 0.3  91.5 ± 0.1  22.2 ± 0.5  37.0 ± 0.5   4.2 ± 0.2  19.7 ± 0.2  22.6 ± 0.2  41.0 ± 0.2