2023 Feb 10;21:1630–1638. doi: 10.1016/j.csbj.2023.02.017

Fig. 2.

Diagram of the machine learning framework. (a) Each amino acid triplet, GXY, of the two α1 chains (red) and the one α2 chain (green) was transformed into a 2048-dimensional embedding vector by the encoder model. (b) The 2048-dimensional embeddings of the sequences (red and green) were encoded into a 100-dimensional sequence embedding (yellow and blue) by the variational autoencoder model. (c) The strain-predictive model was trained with XGBoost, taking the 100-dimensional sequence embeddings as input and returning the predicted strain for each combination of sequences.
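
The input preparation in panel (a) can be sketched as below. This is only a minimal illustration of splitting each chain into consecutive three-residue GXY units before any encoder is applied. The function name `split_into_triplets` and the toy sequences are hypothetical and are not taken from the paper. The encoder, variational autoencoder, and XGBoost stages are not reproduced here.

```python
from typing import List


def split_into_triplets(chain: str) -> List[str]:
    """Split a chain into consecutive three-residue units (GXY)."""
    if len(chain) % 3 != 0:
        raise ValueError("chain length must be a multiple of 3")
    return [chain[i:i + 3] for i in range(0, len(chain), 3)]


# Hypothetical toy sequences for illustration, not data from the paper.
alpha1 = "GPPGAPGLP"
alpha2 = "GPQGFPGER"

# Two alpha-1 chains and one alpha-2 chain, as described in panel (a).
triple_helix = [alpha1, alpha1, alpha2]

# One list of GXY triplets per chain; each triplet would then be passed
# to an encoder model to obtain its embedding vector.
triplets = [split_into_triplets(chain) for chain in triple_helix]

for chain_triplets in triplets:
    print(chain_triplets)
```

Each inner list holds the triplets of one chain, so the downstream encoder in this sketch would receive one triplet at a time and the per-chain ordering would be preserved.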