2016 Feb 24;17:101. doi: 10.1186/s12859-016-0955-3

Table 1.

All features used to train the machine learning algorithm. Each feature was calculated for every cluster.

Feature                  Description
Aliscore                 The number of positions identified by Aliscore as randomly aligned
Length                   The length of the alignment
# of Sequences           The number of sequences in the alignment
# of Gaps                Number of base positions marked with a gap
# of Amino Acids         Number of amino acids in the alignment
Range                    Longest non-aligned sequence length minus shortest non-aligned sequence length
Amino Acid Charged       Standard deviation for the proportions of amino acids in the charged class for each sequence
Amino Acid Uncharged     Standard deviation for the proportions of amino acids in the uncharged class for each sequence
Amino Acid Special       Standard deviation for the proportions of amino acids in the non-charged and non-hydrophobic class for each sequence
Amino Acid Hydrophobic   Standard deviation for the proportions of amino acids in the hydrophobic class for each sequence
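
As an illustration of how most of the features in Table 1 could be computed for a single cluster, the following is a minimal Python sketch, not the authors' implementation. It assumes the alignment is given as a list of equal-length strings with "-" marking gaps. The residue-to-class assignments (CHARGED, UNCHARGED, SPECIAL, HYDROPHOBIC) are hypothetical placeholders, because the table does not list which amino acids belong to each class, and the use of the population standard deviation is likewise an assumption. The Aliscore feature is omitted because it requires running the external Aliscore program.

```python
from statistics import pstdev

# Hypothetical class assignments: the table does not specify class membership,
# so these sets are placeholders for illustration only.
CHARGED = set("DEKRH")
UNCHARGED = set("STNQ")
SPECIAL = set("CGP")
HYDROPHOBIC = set("AVILMFWY")


def class_proportion_sd(aligned_seqs, residue_class):
    """Standard deviation, across sequences, of the proportion of residues
    that belong to residue_class (gaps are ignored)."""
    proportions = []
    for seq in aligned_seqs:
        residues = [c for c in seq.upper() if c != "-"]
        if not residues:
            continue  # skip all-gap rows to avoid dividing by zero
        in_class = sum(1 for c in residues if c in residue_class)
        proportions.append(in_class / len(residues))
    # Assumption: population standard deviation; the source does not say which.
    return pstdev(proportions) if proportions else 0.0


def cluster_features(aligned_seqs):
    """Compute the Table 1 features (except Aliscore) for one alignment."""
    ungapped_lengths = [len(s.replace("-", "")) for s in aligned_seqs]
    return {
        "length": len(aligned_seqs[0]),
        "n_sequences": len(aligned_seqs),
        "n_gaps": sum(s.count("-") for s in aligned_seqs),
        "n_amino_acids": sum(ungapped_lengths),
        "range": max(ungapped_lengths) - min(ungapped_lengths),
        "aa_charged_sd": class_proportion_sd(aligned_seqs, CHARGED),
        "aa_uncharged_sd": class_proportion_sd(aligned_seqs, UNCHARGED),
        "aa_special_sd": class_proportion_sd(aligned_seqs, SPECIAL),
        "aa_hydrophobic_sd": class_proportion_sd(aligned_seqs, HYDROPHOBIC),
    }


# Toy alignment of three sequences, four columns each.
features = cluster_features(["DE-K", "AV--", "DAVK"])
```

Each call returns one feature vector per cluster, which matches the caption's statement that every feature was calculated for each cluster; a classifier would then be trained on one such vector per alignment.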