2021 Nov 24;19:250. doi: 10.1186/s12915-021-01180-4

Fig. 1.

The HostG pipeline. I: A pre-trained CNN model encodes contigs into node feature vectors. II: BLASTN is used to create virus-host connections. III: Protein clusters are created with DIAMOND-based BLASTP and MCL, and these clusters are then used to create virus-virus connections. IV: The knowledge graph is built by combining the node features and edge connections; a GCN is then trained on the graph and used to assign taxonomic labels.
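
To make step IV more concrete, the following is a minimal sketch of how node features and edge connections might be combined into a graph and passed through a single graph-convolution step. It is not HostG's implementation: the function names, the toy feature vectors, the toy edge list, and the identity weight matrix are all assumptions made for illustration, and the propagation rule is the commonly used symmetric normalization with self-loops, which the caption does not specify.

```python
import math

def build_adjacency(num_nodes, edges):
    """Build a symmetric adjacency matrix with self-loops from an undirected edge list."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        adj[i][i] = 1.0  # self-loop so each node keeps its own features
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0
    return adj

def normalize(adj):
    """Symmetric normalization: entry (i, j) is divided by sqrt(deg_i * deg_j)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(norm_adj, features, weights):
    """One propagation step: ReLU(norm_adj @ features @ weights)."""
    hidden = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, h) for h in row] for row in hidden]

# Hypothetical toy graph: nodes 0 and 1 are viruses, node 2 is a host,
# node 3 is a query contig. Feature vectors stand in for CNN encodings.
features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.5, 0.5]]
virus_virus_edges = [(0, 1), (1, 3)]  # stand-in for shared protein clusters
virus_host_edges = [(0, 2)]           # stand-in for a BLASTN match
weights = [[1.0, 0.0], [0.0, 1.0]]    # identity; a trained model would learn these

norm_adj = normalize(build_adjacency(4, virus_virus_edges + virus_host_edges))
embeddings = gcn_layer(norm_adj, features, weights)
```

After this step, each node's embedding is a degree-weighted average of its own features and those of its neighbours, which is how information from labelled nodes can reach an unlabelled query contig. A full model would stack such layers, add a classification layer, and train the weights on nodes with known labels.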