Skip to main content
. 2017 Dec 28;4(1):120–131. doi: 10.1021/acscentsci.7b00512

Figure 11.

Figure 11

Histogram of Levenshtein (string edit) distances of the SMILES of the reproduced molecules to their nearest neighbor in the training set (Staphylococcus aureus, model retrained on 50 actives). While in many cases the model makes changes of a few symbols in the SMILES, resembling the typical modifications applied when exploring series of compounds, the distribution of the distances indicates that the RNN also performs more complex changes by introducing larger moieties or generating molecules that are structurally different, but isofunctional to the training set.