Amino acids are clustered into seven classes (C1-C7). A, Distribution: the lengths of the sequence segments, measured from the N-terminus, that contain the first, first 25%, 50%, 75%, and 100% of the residues of each amino acid class in the protein sequence are represented in blue, pink, green, and yellow, respectively. The dotted line marks the positions of the first, first 25%, 50%, 75%, and 100% of the Class 1 residues in the local region. The number of distribution descriptors is 7 (classes) × 5 (distribution values) = 35 for a local region. B, Composition: the fraction of each amino acid class in a local region is computed, giving 7 descriptors. C, Transition: the frequency with which a residue of one class is followed by a residue of another class. The number of transition descriptors is (7 × 6)/2 = 21. Therefore, each local region is represented by 35 + 7 + 21 = 63 descriptors.
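The descriptor count above can be checked with a minimal Python sketch. Several details are assumptions and are not stated in the caption: the membership of the seven classes (the grouping used here is one commonly used seven-class clustering), the normalization (positions divided by region length, transitions divided by the number of adjacent residue pairs), the rounding rule for the 25%, 50%, and 75% positions, and the names `CLASSES` and `ctd_descriptors`, which are hypothetical. Transitions are counted without regard to direction, which is what the count (7 × 6)/2 = 21 implies.

```python
from itertools import combinations
from math import ceil

# ASSUMPTION: class membership is not given in the caption; this is one
# commonly used seven-class grouping, chosen only for illustration.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
CLASS_OF = {aa: i for i, group in enumerate(CLASSES) for aa in group}


def ctd_descriptors(region):
    """Return 35 distribution + 7 composition + 21 transition values.

    `region` is a string of one-letter amino acid codes; any letter
    outside the 20 standard residues raises a KeyError.
    """
    labels = [CLASS_OF[aa] for aa in region]
    n = len(labels)
    k = len(CLASSES)

    # Distribution: relative position of the first, 25%, 50%, 75% and
    # 100% occurrence of each class (ASSUMPTION: ceil-based rounding,
    # zeros when a class is absent from the region).
    distribution = []
    for c in range(k):
        positions = [i + 1 for i, lab in enumerate(labels) if lab == c]
        if not positions:
            distribution.extend([0.0] * 5)
            continue
        m = len(positions)
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            idx = max(ceil(frac * m), 1) - 1
            distribution.append(positions[idx] / n)

    # Composition: fraction of residues belonging to each class.
    composition = [labels.count(c) / n for c in range(k)]

    # Transition: frequency of adjacent residues from two different
    # classes, pooled over both directions (7 * 6 / 2 = 21 pairs).
    pairs = list(combinations(range(k), 2))
    counts = dict.fromkeys(pairs, 0)
    for a, b in zip(labels, labels[1:]):
        if a != b:
            counts[(min(a, b), max(a, b))] += 1
    steps = max(n - 1, 1)  # avoid division by zero for 1-residue regions
    transition = [counts[p] / steps for p in pairs]

    return distribution + composition + transition


if __name__ == "__main__":
    vector = ctd_descriptors("AGVILFPYMTSHNQWRKDEC")
    print(len(vector))  # 35 + 7 + 21
```

The returned list follows the order used in the caption (distribution, composition, transition), so a protein split into several local regions would be encoded by concatenating one such vector per region.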