2017 Dec 13;37(1-2):1700123. doi: 10.1002/minf.201700123

Figure 3.

Sequence generation using teacher forcing. The last decoder layer, trained with teacher forcing, receives two inputs: the output of the previous layer and the character from the previous time step. In training mode, the previous character is taken from the corresponding position in the input sequence, regardless of the output probabilities. In generation mode, the decoder samples a new character at each time step from the output probability distribution and uses it as the input for the next time step.
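
The difference between the two modes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the alphabet `VOCAB`, the start token `START`, the scalar `context` standing in for the previous layer's output, and the function `toy_decoder_step` (a fixed rule in place of a trained recurrent layer) are all hypothetical and chosen only to show how the decoder input is selected in each mode.

```python
import random

VOCAB = ["C", "N", "O", "$"]  # hypothetical toy alphabet; "$" marks end of sequence
START = "^"                   # hypothetical start token fed at the first time step


def toy_decoder_step(prev_char, context):
    """Stand-in for the trained decoder layer: returns P(next character).

    A real model would combine the previous layer's output with prev_char
    in a recurrent cell; this toy rule only makes the result depend on both.
    """
    weights = [0.5 if c == prev_char else 1.0 for c in VOCAB]
    weights[-1] += context  # assumption: a larger context favours ending the sequence
    total = sum(weights)
    return [w / total for w in weights]


def decoder_inputs_training(target):
    """Teacher forcing: the input at step t is the ground-truth character at t-1."""
    return [START] + list(target[:-1])


def run_training_pass(target, context):
    """Compute the output distribution at every step of the target sequence.

    The fed-back character always comes from the target, never from the
    model's own prediction.
    """
    return [toy_decoder_step(prev, context) for prev in decoder_inputs_training(target)]


def generate(context, max_len=20, rng=None):
    """Generation mode: sample a character and feed it back as the next input."""
    rng = rng or random.Random()
    prev_char, out = START, []
    for _ in range(max_len):
        probs = toy_decoder_step(prev_char, context)
        prev_char = rng.choices(VOCAB, weights=probs, k=1)[0]
        if prev_char == "$":
            break
        out.append(prev_char)
    return "".join(out)
```

For a target such as `"CNO$"`, `decoder_inputs_training` yields the inputs `^`, `C`, `N`, `O`, so each step is conditioned on the true preceding character. In `generate`, the only change is that the fed-back character is the one just sampled.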