2020 Mar 3;11(12):3316–3325. doi: 10.1039/c9sc05704h

Evaluation of single-step retrosynthetic models. The test data set consisted of 10k entries. For every reaction we generated 10 predictions, giving 100k precursor suggestions in total. The table reports the round-trip accuracy (RT), the coverage (Cov.), the class diversity (CD), the inverse of the Jensen–Shannon divergence of the class likelihood distributions (1/JSD), the percentage of invalid SMILES (ismi) and the human expert evaluation (hu. ev.). Models with the “_i” suffix were trained on an inchified data set. Models whose names start with “ste” were trained on the stereo data set, and those starting with “pist” on the pistachio data set.

| Model (Retro) | Model (Forw.) | Test data | RT [%] | Cov. [%] | CD | 1/JSD | ismi [%] | hu. ev. |
|---|---|---|---|---|---|---|---|---|
| ste_i | pist_i | ste | 81.2 | 95.1 | 1.8 | 16.5 | 0.5 | |
| ste_i | pist_i | pist | 79.1 | 93.8 | 1.8 | 20.6 | 1.1 | |
| pist_i | pist_i | pist | 74.9 | 95.3 | 2.1 | 22.0 | 0.5 | + |
| pist | pist_i | pist | 71.1 | 92.6 | 2.1 | 27.2 | 0.6 | ++ |
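
The caption names the four metrics without stating how they are computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a precursor suggestion counts as round-trip valid when the forward model maps it back to the target, that coverage is the fraction of targets with at least one valid suggestion, that class diversity is the mean number of distinct reaction classes among valid suggestions per target, and that the divergence is the equal-weight generalized Jensen–Shannon divergence over per-class likelihood histograms. The record fields (`target`, `forward_product`, `rxn_class`) and the toy data are hypothetical.

```python
import math

def round_trip_accuracy(records):
    """Fraction of suggestions whose forward prediction equals the target (assumed definition)."""
    hits = sum(1 for r in records if r["forward_product"] == r["target"])
    return hits / len(records)

def coverage(records):
    """Fraction of targets with at least one round-trip-valid suggestion (assumed definition)."""
    targets = {r["target"] for r in records}
    covered = {r["target"] for r in records if r["forward_product"] == r["target"]}
    return len(covered) / len(targets)

def class_diversity(records):
    """Mean number of distinct reaction classes among valid suggestions per target (assumed)."""
    targets = {r["target"] for r in records}
    classes = {t: set() for t in targets}
    for r in records:
        if r["forward_product"] == r["target"]:
            classes[r["target"]].add(r["rxn_class"])
    return sum(len(c) for c in classes.values()) / len(targets)

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution; zero bins are skipped."""
    return -sum(x * math.log(x) for x in p if x > 0)

def jsd(dists):
    """Equal-weight generalized Jensen-Shannon divergence: H(mean of P_i) - mean of H(P_i)."""
    n = len(dists)
    mixture = [sum(col) / n for col in zip(*dists)]
    return entropy(mixture) - sum(entropy(d) for d in dists) / n

# Hypothetical toy data: two targets, four precursor suggestions.
records = [
    {"target": "A", "forward_product": "A", "rxn_class": 1},
    {"target": "A", "forward_product": "A", "rxn_class": 2},
    {"target": "A", "forward_product": "X", "rxn_class": 1},
    {"target": "B", "forward_product": "Y", "rxn_class": 3},
]

rt = round_trip_accuracy(records)
cov = coverage(records)
cd = class_diversity(records)
```

Under these assumptions, a larger 1/JSD means the per-class likelihood distributions are more alike. In the sketch, `jsd` is zero for identical distributions, so the inverse would need a guard against division by zero in practice.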