PeerJ. 2017 Apr 18;5:e3139. doi: 10.7717/peerj.3139

Table 2. Dataset features.

A description of the features of the dataset used in this study.

| Feature(s) | Number | Binary | Description |
|---|---|---|---|
| Cosine similarity | 1 | No | Cosine similarity of the amino acid profiles at positions i and j. |
| Correlation measure | 1 | No | Correlation measure of the amino acid profiles at positions i and j. |
| Mutual information | 1 | No | Mutual information of the amino acid profiles at positions i and j. |
| Amino acid types | 10 | Yes | Types of the two amino acids in the pair, among nonpolar, polar, acidic, and basic. |
| Levitt’s contact potential | 1 | No | Amino acid pair energy measure. |
| Jernigan’s pairwise potential | 1 | No | Amino acid pair energy measure. |
| Braun’s pairwise potential | 1 | No | Amino acid pair energy measure. |
| MSA amino acid profiles | 483 | No | Profile of each of the 20 amino acids, plus gap, at each of the 18 sliding window positions and five central segment positions. |
| MSA entropy | 23 | No | Profile entropy at each of the 18 sliding window positions and five central segment positions. |
| Solvent accessibility | 46 | Yes | Solvent accessibility of the amino acid (buried or exposed) at each of the 18 sliding window positions and five central segment positions. |
| Secondary structure | 69 | Yes | Secondary structure of the amino acid (helix, sheet, or coil) at each of the 18 sliding window positions and five central segment positions. |
| Central segment amino acid compositions | 21 | No | Overall proportions of each of the 20 amino acids, plus gap, across all central segments. |
| Central segment secondary structure compositions | 3 | No | Overall proportions of the three secondary structures across the central segments. |
| Central segment solvent accessibility compositions | 2 | No | Overall proportions of the two solvent accessibilities across the central segments. |
| Amino acid sequence separation | 16 | Yes | Amino acid sequence separation using bins <6, 6, 7, 8, 9, 10, 11, 12, 13, 14, <19, <24, ≤29, ≤39, ≤49, and ≥50. |
| Protein secondary structure composition | 3 | No | Overall secondary structure composition of the protein containing the contact pair. |
| Protein length | 4 | Yes | Length of the protein containing the contact pair, using bins ≤50, ≤100, ≤150, and >150. |
| Protein solvent accessibility composition | 2 | No | Overall solvent accessibility composition of the protein containing the contact pair. |
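
Several of the counts in Table 2 follow from the 23 positions described (18 sliding window plus five central segment). The Python sketch below reproduces that arithmetic and shows one possible encoding of the protein length bins. It is illustrative only and is not the authors' code. The names `N_POSITIONS`, `N_SYMBOLS`, `AA_CLASS_PAIRS`, `derived_counts`, and `encode_protein_length` are hypothetical. Two assumptions are made here and not confirmed by the table: that the 10 amino acid type features correspond to the unordered pairs of the four classes, and that the length bins are mutually exclusive (one-hot).

```python
from itertools import combinations_with_replacement

# Positions described in the table: 18 sliding window + 5 central segment.
N_POSITIONS = 18 + 5
# Profile alphabet: the 20 amino acids plus the gap symbol.
N_SYMBOLS = 20 + 1

AA_CLASSES = ("nonpolar", "polar", "acidic", "basic")
# Assumption: the 10 binary type features are the unordered class pairs.
AA_CLASS_PAIRS = list(combinations_with_replacement(AA_CLASSES, 2))

# Feature counts recomputed from the descriptions in the table.
derived_counts = {
    "Amino acid types": len(AA_CLASS_PAIRS),             # 10
    "MSA amino acid profiles": N_SYMBOLS * N_POSITIONS,  # 21 * 23 = 483
    "MSA entropy": N_POSITIONS,                          # 23
    "Solvent accessibility": 2 * N_POSITIONS,            # buried/exposed: 46
    "Secondary structure": 3 * N_POSITIONS,              # helix/sheet/coil: 69
}


def encode_protein_length(length):
    """One-hot encode a protein length into the bins <=50, <=100, <=150, >150.

    Assumption: bins are exclusive, so exactly one of the four flags is set.
    """
    upper_bounds = (50, 100, 150)
    # Count how many upper bounds the length exceeds to find its bin index.
    index = sum(length > bound for bound in upper_bounds)
    return [1 if i == index else 0 for i in range(4)]
```

Under these assumptions, `encode_protein_length(120)` returns `[0, 0, 1, 0]`, and the values in `derived_counts` match the Number column of the corresponding rows.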