Figure 1.
Schematic of the variational autoencoder (VAE) antimicrobial peptide (AMP) generation and design process. The process occurs in two stages: (1) training the VAE to develop the latent space and training a regression model for activity prediction, and (2) sampling from the latent space, generating new AMPs, and assigning predicted minimum inhibitory concentration (MIC) values. Stage 1 begins by training the VAE on the E. coli dataset. The general design of the VAE was previously described (Dean and Walper, 2020) and builds on the VAE of Bowman et al. (2015), which was originally reported for generating new sentences. Here, the number of intermediate dimensions was set to 1,024 and the number of latent dimensions was set to 50. Training was stopped after 500 epochs or when the rate of decrease in the loss became sufficiently low. The final state of the model was saved; the encoder is used in Stage 1 and the decoder is used in Stage 2. The MIC prediction regression model is trained on the same dataset and is used in Stage 2, after sequence generation by the VAE decoder, to assign predicted MIC values against E. coli to the new AMPs. A more detailed description of the framework design is provided in the Methods section.
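
As an illustration only, the stopping rule and the latent sampling step named in the caption could be sketched as follows. The latent size (50) and epoch cap (500) come from the caption. The caption gives no numeric threshold for a "sufficiently low rate", so `min_delta` and `patience` are assumed values. The standard normal prior is also an assumption, and the function names are hypothetical, not taken from the authors' code.

```python
import random

LATENT_DIM = 50    # latent dimensions, as stated in the caption
MAX_EPOCHS = 500   # epoch cap, as stated in the caption


def should_stop(losses, max_epochs=MAX_EPOCHS, min_delta=1e-4, patience=5):
    """Return True once training should halt.

    `losses` holds one loss value per completed epoch. `min_delta` and
    `patience` are assumed values; the caption does not specify them.
    """
    if len(losses) >= max_epochs:
        return True
    if len(losses) <= patience:
        return False
    # Total improvement over the most recent `patience` epochs.
    improvement = losses[-patience - 1] - losses[-1]
    return improvement < min_delta


def sample_latent(n, dim=LATENT_DIM, seed=None):
    """Draw n latent vectors from an assumed standard normal prior (Stage 2)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
```

In Stage 2, each vector returned by `sample_latent` would be passed to the trained decoder to produce a peptide sequence, which the regression model then scores with a predicted MIC; both of those models are outside this sketch.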