

2024 May 10;14(8):3476–3492. doi: 10.1016/j.apsb.2024.05.003


© 2024 The Authors

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).


Construction of the COMDEL model. (A) The Venn diagram illustrates the antimicrobial peptide (AMP) data collected from three AMP databases: CAMPR4³³, ADAM¹⁷ and APD3¹⁵. (B) The pie chart displays the length distribution of the collected AMP and non-AMP data. (C) The COMDEL model consists of four main parts. The first is the ‘Embedding Module’, where each part of a protein sequence is turned into multiple data points based on its context. These data points are then standardized. Next, the ‘Encoding Module’ uses a special technique to understand complex sequence patterns, ensuring maximal utilization of every sequence portion. The third part, ‘Physicochemical Property Extraction’, pulls out 56 unique characteristics from the sequences, and these characteristics are processed using 13 different machine learning models. The outcomes of this step are then merged with those of the Encoding Module. Lastly, the ‘Task-Specific Module’ uses a group of neural networks to refine this information, turning it into probabilities for different categories. (D) AMP prediction accuracy comparison of the COMDEL model across 13 NNAs. (E) The AMP prediction performance of the COMDEL model in training and test datasets.
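
The merging step described for panel (C) can be illustrated with a minimal sketch. It is not the COMDEL implementation: the caption does not list the 56 physicochemical characteristics, so the four descriptors below (length, a crude net charge, mean Kyte-Doolittle hydropathy, aromatic fraction) are stand-ins chosen for illustration, and the function names `physchem_features` and `fuse`, the peptide `KKLL`, and the two-element "encoded" vector are all hypothetical.

```python
from typing import List

# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def physchem_features(seq: str) -> List[float]:
    """Return a few illustrative sequence-level descriptors.

    Assumption: these four values stand in for the 56 characteristics
    mentioned in the caption, which are not specified there.
    Non-standard residue letters raise KeyError.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    # Crude net charge: count K/R as +1 and D/E as -1 (ignores His and termini).
    charge = sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")
    # Mean hydropathy over all residues.
    hydropathy = sum(KYTE_DOOLITTLE[a] for a in seq) / n
    # Fraction of aromatic residues (F, W, Y).
    aromatic = sum(seq.count(a) for a in "FWY") / n
    return [float(n), float(charge), hydropathy, aromatic]


def fuse(encoded: List[float], props: List[float]) -> List[float]:
    """Merge the two feature sets by simple concatenation (an assumption)."""
    return list(encoded) + list(props)


# Hypothetical usage: a placeholder encoder output joined with the descriptors.
features = fuse([0.1, 0.2], physchem_features("KKLL"))
```

Concatenation is used here only because it is the simplest way to combine two feature vectors; the caption says the outcomes are "merged" without stating how.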