FIGURE 1.
Architecture of the recurrent neural network (RNN) language model. Given a starting amino acid, the RNN language model predicts the subsequent amino acids residue by residue until it reaches the end‐of‐sequence (EOS) signal (represented as a cross marker). The amino acids and the EOS signal are one‐hot encoded. At each time step, the RNN outputs a probability vector over the amino acids and the EOS signal for the next position, to which sampling strategies can be applied.
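
The generation loop described in this caption can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the figure: the plain tanh recurrent cell, the hidden size of 16, the random untrained weights, the length cap `max_len`, and the use of temperature sampling as one possible sampling strategy. The names `one_hot`, `softmax`, and `generate` are hypothetical helpers.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
EOS = len(AMINO_ACIDS)                # index 20 is the end-of-sequence signal
VOCAB = len(AMINO_ACIDS) + 1          # 20 amino acids + EOS


def one_hot(index, size=VOCAB):
    """Return a one-hot vector with a 1 at position `index`."""
    vec = np.zeros(size)
    vec[index] = 1.0
    return vec


def softmax(logits, temperature=1.0):
    """Turn logits into a probability vector (temperature is an assumed knob)."""
    z = logits / temperature
    z = z - z.max()  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()


def generate(start, params, rng, temperature=1.0, max_len=50):
    """Sample residues one at a time until EOS is drawn or max_len is reached."""
    W_xh, W_hh, W_hy = params
    h = np.zeros(W_hh.shape[0])        # initial hidden state
    idx = AMINO_ACIDS.index(start)     # starting amino acid
    seq = [start]
    while len(seq) < max_len:
        # One recurrent step on the one-hot encoded current residue.
        h = np.tanh(W_xh @ one_hot(idx) + W_hh @ h)
        # Probability of each amino acid and of EOS at the next position.
        probs = softmax(W_hy @ h, temperature)
        idx = int(rng.choice(VOCAB, p=probs))
        if idx == EOS:
            break
        seq.append(AMINO_ACIDS[idx])
    return "".join(seq)


# Random, untrained weights: the output is meaningless as a peptide and
# only demonstrates the control flow of residue-by-residue sampling.
rng = np.random.default_rng(0)
hidden = 16
params = (
    rng.normal(scale=0.1, size=(hidden, VOCAB)),
    rng.normal(scale=0.1, size=(hidden, hidden)),
    rng.normal(scale=0.1, size=(VOCAB, hidden)),
)
peptide = generate("M", params, rng)
print(peptide)
```

In this sketch, lowering `temperature` concentrates the sampling distribution on the most probable residues, while raising it makes the draws more uniform; other sampling strategies would replace only the `rng.choice` step.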