Table 1.
Features used to train the machine learning algorithm. Each feature was calculated for every cluster.
Feature | Description |
---|---|
Aliscore | The number of positions identified by Aliscore as randomly aligned |
Length | The length of the alignment |
# of Sequences | The number of sequences in the alignment |
# of Gaps | The number of alignment positions marked with a gap |
# of Amino Acids | Number of amino acids in the alignment |
Range | Longest non-aligned sequence length minus shortest non-aligned sequence length |
Amino Acid Charged | Standard deviation for the proportions of amino acids in the charged class for each sequence |
Amino Acid Uncharged | Standard deviation for the proportions of amino acids in the uncharged class for each sequence |
Amino Acid Special | Standard deviation for the proportions of amino acids in the non-charged and non-hydrophobic class for each sequence |
Amino Acid Hydrophobic | Standard deviation for the proportions of amino acids in the hydrophobic class for each sequence |
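
As a rough illustration of how the sequence-derived features in Table 1 could be computed for one cluster, the sketch below takes a list of aligned protein sequences and returns a feature dictionary. Several details are assumptions rather than anything stated in the table: the function name `alignment_features`, the `-` gap character, the assignment of amino acids to the four classes (a common textbook grouping), the use of the population standard deviation, and counting "# of Gaps" as the total number of gap characters. The Aliscore feature is left out because it depends on running the external Aliscore tool.

```python
from statistics import pstdev

# ASSUMPTION: class membership follows a common textbook grouping;
# the source table does not list which residues belong to each class.
CLASSES = {
    "charged": set("DEKRH"),
    "uncharged": set("STNQ"),
    "special": set("CGP"),
    "hydrophobic": set("AVILMFYW"),
}


def alignment_features(aligned):
    """Compute Table 1 style features for one alignment.

    aligned: list of equal-length strings, with '-' marking gaps (assumed).
    """
    # Non-aligned sequences are the aligned rows with gaps removed.
    ungapped = [seq.replace("-", "") for seq in aligned]
    lengths = [len(seq) for seq in ungapped]

    features = {
        "length": len(aligned[0]),
        "n_sequences": len(aligned),
        # ASSUMPTION: gaps are counted as total gap characters.
        "n_gaps": sum(seq.count("-") for seq in aligned),
        "n_amino_acids": sum(lengths),
        # Longest minus shortest non-aligned sequence length.
        "range": max(lengths) - min(lengths),
    }

    for name, members in CLASSES.items():
        # Proportion of residues in this class, one value per sequence.
        proportions = [
            sum(res in members for res in seq) / len(seq) if seq else 0.0
            for seq in ungapped
        ]
        # ASSUMPTION: population standard deviation across sequences.
        features[f"sd_{name}"] = pstdev(proportions)

    return features


if __name__ == "__main__":
    example = ["AK-D", "AKGD", "A--D"]  # toy alignment, not real data
    for key, value in alignment_features(example).items():
        print(key, value)
```

Swapping `pstdev` for `statistics.stdev` would give the sample standard deviation instead, which requires at least two sequences per cluster.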