Table 2.
Initial numbers and types of features used to encode sequence fragments
Feature type | Features | Number of features |
---|---|---|
Physicochemical property-based | Amino acid composition, average flexibility indices, hydrophobicity indices, net charge, partition coefficient, residue volume and molecular weight | 147 (21 × 7) |
Sequence-based | Binary-encoding | 420 (21 × 20) |
Structural level | Accessible surface area; secondary structure (coil, helix and strand) and disordered regions | 105 (21 × 5) |
Functional features | Gene ontology (GO) terms: (1) biological process (BP), (2) molecular function (MF) and (3) cellular component (CC); protein domain; KEGG pathway | 555 GO, 177 domain, 114 KEGG pathway |
Functional annotation | UP_SEQ_FEATURE and UP_KEYWORDS | 526 |
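
The binary-encoding count in the table (420 = 21 × 20) can be illustrated with a minimal sketch. It assumes that 21 is the length of the sequence fragment and 20 is the number of standard amino acids, which the table's arithmetic suggests but does not state. The alphabet order, the function name `binary_encode`, and the all-zero treatment of non-standard residues are hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import List

# Assumption: the 20 standard amino acids, in an arbitrary but fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: fragment length inferred from the "21 x 20" count in the table.
WINDOW = 21


def binary_encode(fragment: str) -> List[int]:
    """One-hot encode a fragment into a flat vector of WINDOW * 20 values."""
    if len(fragment) != WINDOW:
        raise ValueError(f"expected {WINDOW} residues, got {len(fragment)}")
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    vector: List[int] = []
    for residue in fragment:
        one_hot = [0] * len(AMINO_ACIDS)
        pos = index.get(residue)
        # Hypothetical choice: non-standard residues (e.g. padding 'X')
        # are left as an all-zero block.
        if pos is not None:
            one_hot[pos] = 1
        vector.extend(one_hot)
    return vector
```

Under these assumptions, each residue contributes a block of 20 values with at most one set to 1, so a 21-residue fragment yields a 420-dimensional vector.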