Table 2.
Summary of the considered features, where y denotes one of the three secondary structure states and x denotes one of the 20 common AAs.
| Feature set (number of features) | Description |
|---|---|
| Sequence-based (79) | Sequence length (1); composition vector (20); the number of AAs in the sequence belonging to {R group, Electronic group, Hydrophobicity group, Exchange group} (18); first and second order composition moment vector (40) |
| PSSM-based (203) | From the PSSM matrix |
| Secondary structure (217) | Based on the features utilized in the PSI-Pred method (90); based on the predicted secondary structure, describing the collocation of helical and strand segments (127) |
| Average RSA based (23) | Average RSA of the residues with AA type x (20); average RSA of the residues with secondary structure type y (3) |
| Average isoelectric point (1) | $pI = \frac{1}{N}\sum_{i=1}^{N} pI_i$, where the $pI_i$ values are taken from [11] |
| Auto-correlation functions based on FHi, EHi, and HP indices (25) | $A_n^a = \frac{1}{N-n}\sum_{i=1}^{N-n} a_i \, a_{i+n}$, where a denotes the physicochemical property: the Fauchere-Pliska hydrophobicity index (FH) with n = 1, 2, …, 10; the Eisenberg hydrophobicity index (EH) with n = 1, 2, …, 6; or the hydropathy index (HP) with n = 1, 2, …, 9. |
| Auto-correlation functions based on cumulative FHi index (6) | $A_n^a = \frac{1}{N-n}\sum_{i=1}^{N-n}\left(\sum_{j=1}^{i} a_j\right)\left(\sum_{j=1}^{i+n} a_j\right)$, where a is the FH index with n = 1, 2, …, 6. |
| Sum of hydrophobicities based on FHi and EHi (2) | $H_{sum}^a = \sum_{i=1}^{N} a_i$, where a is the FH or the EH index. |
| R groups (5) | RGi, where i = 1 corresponds to nonpolar aliphatic AAs (AVLIMG), i = 2 to polar uncharged AAs (SPTCNQ), i = 3 to positively charged AAs (KHR), i = 4 to negatively charged AAs (DE), and i = 5 to aromatic AAs (FYW); the composition percentage of each group in the sequence is computed |
| Electronic groups (5) | EGi, where i = 1 corresponds to electron donor AAs (DEPA), i = 2 to weak electron donor AAs (LIV), i = 3 to electron acceptor AAs (KNR), i = 4 to weak electron acceptor AAs (FYMTQ), and i = 5 to neutral AAs (GHWS); the composition percentage of each group in the sequence is computed |
| BLAST-based (30) | Refer to subsection “Features” |
| GLAM2-based (30) | Refer to subsection “Features” |
| GIBBS-based (6) | Refer to subsection “Features” |
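
The auto-correlation, cumulative auto-correlation, hydrophobicity-sum, and group-composition features listed in Table 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, `TOY_INDEX` holds made-up placeholder values rather than the real FH, EH, or HP indices (which would have to be taken from the cited literature), and the group composition is assumed to be the percentage of residues in the sequence that fall into each group.

```python
from itertools import accumulate
from typing import Dict, List, Sequence

# Group definitions as listed in Table 2.
R_GROUPS = ["AVLIMG", "SPTCNQ", "KHR", "DE", "FYW"]
ELECTRONIC_GROUPS = ["DEPA", "LIV", "KNR", "FYMTQ", "GHWS"]

# Hypothetical placeholder index: NOT the real FH, EH, or HP values.
TOY_INDEX: Dict[str, float] = {"A": 1.0, "L": 2.0, "K": -1.0}


def encode(seq: str, index: Dict[str, float]) -> List[float]:
    """Map each residue to its physicochemical index value a_i."""
    return [index[aa] for aa in seq]


def autocorrelation(a: Sequence[float], n: int) -> float:
    """A_n = 1/(N - n) * sum_{i=1}^{N-n} a_i * a_{i+n}."""
    N = len(a)
    return sum(a[i] * a[i + n] for i in range(N - n)) / (N - n)


def cumulative_autocorrelation(a: Sequence[float], n: int) -> float:
    """Same as autocorrelation, but on the running (prefix) sums of a."""
    N = len(a)
    prefix = list(accumulate(a))  # prefix[k] = a_1 + ... + a_{k+1}
    return sum(prefix[i] * prefix[i + n] for i in range(N - n)) / (N - n)


def hydrophobicity_sum(a: Sequence[float]) -> float:
    """H_sum = sum_{i=1}^{N} a_i."""
    return sum(a)


def group_composition(seq: str, groups: Sequence[str]) -> List[float]:
    """Percentage of residues in seq that belong to each group.

    Assumption: 'composition percentage' is measured relative to the
    full sequence length. Residues not listed in any group (e.g. C for
    the electronic groups as given in Table 2) count toward no group.
    """
    return [100.0 * sum(seq.count(aa) for aa in g) / len(seq) for g in groups]


if __name__ == "__main__":
    seq = "ALKA"
    a = encode(seq, TOY_INDEX)
    print([autocorrelation(a, n) for n in (1, 2)])
    print([cumulative_autocorrelation(a, n) for n in (1, 2)])
    print(hydrophobicity_sum(a))
    print(group_composition(seq, R_GROUPS))
    print(group_composition(seq, ELECTRONIC_GROUPS))
```

With real index tables, calling `autocorrelation` for n = 1, …, 10 (FH), n = 1, …, 6 (EH), and n = 1, …, 9 (HP) would yield the 25 auto-correlation features, and `cumulative_autocorrelation` for n = 1, …, 6 the 6 cumulative ones. Note that both functions require the sequence to be longer than n.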