2019 May 24;11:35. doi: 10.1186/s13321-019-0355-6

Fig. 3.

Molecule generation with the assistance of the exploration strategy during the training process. At each token-selection step, a random number between 0 and 1 is generated. If the value is larger than a pre-set threshold (the exploring rate, ε), the probability distribution for the next token is determined by the current generator (the exploitation network, Gθ). Otherwise, it is determined by the exploration network (Gφ).
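
The selection rule in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `select_token` and its parameters are hypothetical, and the two probability lists simply stand in for the next-token distributions that Gθ and Gφ would output at a given step.

```python
import random


def select_token(exploit_probs, explore_probs, epsilon, rng=random):
    """Sample one token index using the rule described in the caption.

    exploit_probs: next-token distribution from the exploitation network
                   (a stand-in for the output of G_theta; hypothetical input).
    explore_probs: next-token distribution from the exploration network
                   (a stand-in for the output of G_phi; hypothetical input).
    epsilon:       the exploring rate, a threshold between 0 and 1.
    rng:           the `random` module or a `random.Random` instance.
    """
    # Draw a random number in [0, 1).
    u = rng.random()

    # Larger than the threshold: use the exploitation network.
    # Otherwise: use the exploration network.
    used_exploitation = u > epsilon
    probs = exploit_probs if used_exploitation else explore_probs

    # Sample a token index from the chosen distribution.
    token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return token, used_exploitation


if __name__ == "__main__":
    rng = random.Random(0)
    exploit = [0.7, 0.2, 0.1]  # assumed toy distribution over 3 tokens
    explore = [0.1, 0.2, 0.7]  # assumed toy distribution over 3 tokens
    picks = [select_token(exploit, explore, 0.1, rng) for _ in range(1000)]
    share = sum(used for _, used in picks) / len(picks)
    print(f"share of steps decided by the exploitation network: {share:.2f}")
```

Under this rule, a small ε means that most tokens come from the exploitation network, while setting ε to 1 would hand every step to the exploration network.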