2014 Jul 25;42(15):10086–10098. doi: 10.1093/nar/gku681

Table 1. Summary of feature columns and fragment sizes.

| Feature name | No. of columns | No. of residues per fragment window |
|---|---|---|
| 21-bit Sparse Coding | 21 per residue | 11 sequential residues (a sliding window of size 5) |
| GAC | 20 per fragment | whole protein sequence |
| PSSM | 20 per residue | 11 sequential residues |
| EC | 1 per residue | 11 sequential residues |
| LN | 5 per residue | 11 sequential residues |
| Normalized ASAs | 3 per residue | 11 spatial residues (a neighboring window of size 10) |
| Physicochemical (PC) property | 10 per fragment | 21 spatial residues (a neighboring window of size 20) |
| Predicted secondary structure | 8 per residue | 11 sequential residues |

Because the GAC is calculated from the whole protein sequence, the same GAC vector is appended to every coding fragment of that protein. For the PC feature, the list of 21 neighboring residues of a coding fragment returns 10 values in total, not 10 values per residue.
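The column counts in Table 1 can be tallied with a short sketch. It rests on two assumptions that the table does not state: that all features are concatenated into a single vector per coding fragment, and that "a sliding window of size 5" means 5 residues on each side of the central residue (which gives the 11 sequential residues listed). The function names and the padding character `X` are hypothetical and chosen only for illustration.

```python
# Column counts copied from Table 1. Concatenating them into one vector
# per fragment is an assumption made for this sketch.
PER_RESIDUE = {
    "21-bit sparse coding": 21,
    "PSSM": 20,
    "EC": 1,
    "LN": 5,
    "Normalized ASAs": 3,
    "Predicted secondary structure": 8,
}
PER_FRAGMENT = {"GAC": 20, "PC": 10}
WINDOW_RESIDUES = 11  # residues per fragment window for per-residue features


def fragment_vector_length(per_residue=PER_RESIDUE,
                           per_fragment=PER_FRAGMENT,
                           window=WINDOW_RESIDUES):
    """Length of one fragment's vector if every feature is concatenated."""
    return window * sum(per_residue.values()) + sum(per_fragment.values())


def sequential_window(sequence, center, half_width=5, pad="X"):
    """Return the 2 * half_width + 1 residues centered on `center`.

    Positions that fall outside the sequence are filled with `pad`
    (a hypothetical placeholder; the source does not say how termini
    are handled).
    """
    return "".join(
        sequence[i] if 0 <= i < len(sequence) else pad
        for i in range(center - half_width, center + half_width + 1)
    )


if __name__ == "__main__":
    print(fragment_vector_length())                          # 668
    print(sequential_window("ACDEFGHIKLMNPQRSTVWY", 0))      # XXXXXACDEFG
```

Under these assumptions the per-residue features contribute 11 × 58 = 638 columns and the per-fragment features (GAC and PC) contribute 30, for 668 columns per coding fragment.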