2017 Dec 28;4(1):120–131. doi: 10.1021/acscentsci.7b00512

Figure 3.


Symbol generation and sampling process. We start with a random seed symbol s1, here c, which is converted into a one-hot vector x1 and fed into the model. The model then updates its internal state from h0 to h1 and outputs y1, the probability distribution over the next symbol. Here, sampling yields s2 = 1. Converting s2 to x2 and feeding it to the model leads to the updated hidden state h2 and output y2, from which we can sample again. This iterative symbol-by-symbol procedure can be continued as long as desired. In this example, we stop after observing an EOL (\n) symbol and obtain the SMILES for benzene. The hidden state hi allows the model to keep track of opened brackets and rings, to ensure that they will be closed again later.
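The sampling loop described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_step` is a hypothetical stand-in for the trained recurrent network, hard-wired to put all probability mass on the next symbol of the benzene string. The three-symbol vocabulary and the names `one_hot` and `sample_smiles` are likewise invented for this example and do not come from the paper.

```python
import random

# Toy vocabulary (assumption); a real model would cover the full SMILES alphabet.
VOCAB = ["c", "1", "\n"]


def one_hot(symbol):
    """Encode a symbol s_i as a one-hot vector x_i over VOCAB."""
    return [1.0 if s == symbol else 0.0 for s in VOCAB]


def toy_step(x, h):
    """Hypothetical stand-in for the trained network.

    Takes the one-hot input x_i and the previous hidden state h_{i-1},
    and returns (h_i, y_i). Here the hidden state is just the number of
    symbols consumed so far, and y_i puts all probability on the next
    symbol of benzene. This toy ignores x; a real network would not.
    """
    target = "c1ccccc1\n"
    h_next = h + 1
    next_symbol = target[h_next] if h_next < len(target) else "\n"
    return h_next, one_hot(next_symbol)


def sample_smiles(step, seed_symbol="c", max_len=100, rng=None):
    """Generate symbols one at a time until EOL or max_len is reached."""
    rng = rng or random.Random(0)
    h = 0                      # initial hidden state h_0
    symbols = [seed_symbol]    # s_1 is the seed symbol
    while len(symbols) < max_len:
        # Feed x_i, update the hidden state, get the distribution y_i.
        h, y = step(one_hot(symbols[-1]), h)
        # Sample s_{i+1} from y_i.
        s = rng.choices(VOCAB, weights=y, k=1)[0]
        if s == "\n":          # stop on the EOL symbol
            break
        symbols.append(s)
    return "".join(symbols)


print(sample_smiles(toy_step))
```

Because the toy distribution is deterministic, the loop always reproduces the benzene string from the figure. With a trained model, `step` would return a genuine probability distribution and repeated calls would yield different molecules.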