2017 Mar 9;31(4):379–391. doi: 10.1007/s10822-016-0008-z

Fig. 3.

a Perplexity scores (left) and valid grammar rates (1 − the syntax error rate) (right) for 1000 SMILES strings generated from the trained chemical language models. The conventional n-gram and the extended language models were trained with the BO and KN algorithms. Error bars show the standard deviations across 10 experiments, each using a different training set. b Examples of molecules generated from the trained chemical language model with n = 10 (top). The bottom row shows the most similar PubChem compounds, which had a Tanimoto coefficient of 0.9 on the PubChem fingerprint
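
The three quantities named in the caption can be illustrated with a minimal Python sketch. It assumes the standard per-token definition of perplexity and the usual set-based Tanimoto coefficient on fingerprint bits; the caption does not say exactly how the authors computed either one, and the function names and inputs here are hypothetical.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities.

    Assumes the standard definition exp(-mean log p); the paper's exact
    normalization is not given in the caption.
    """
    return math.exp(-sum(log_probs) / len(log_probs))


def valid_grammar_rate(n_generated, n_syntax_errors):
    """Valid grammar rate = 1 - syntax error rate, as stated in the caption."""
    return 1.0 - n_syntax_errors / n_generated


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    # Two empty fingerprints share nothing; return 0.0 by convention here.
    return len(a & b) / union if union else 0.0
```

For example, `tanimoto({1, 2, 3}, {2, 3, 4})` counts two shared bits out of four distinct bits, and `valid_grammar_rate(1000, 50)` corresponds to 50 syntactically invalid strings among 1000 generated ones (illustrative numbers, not results from the paper).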