Table 1.
Protein features.
ID | Property name | Type | Size | Sub-names | S. cere | E. coli |
---|---|---|---|---|---|---|
1 | Amino acid occurrence [20] | S | 20 | A … Y | • | • |
2 | Average amino acid PSSM [20] | S | 20 | A … Y | • | • |
3 | Average cysteine position [20] | S | 1 | | • | • |
4 | Average distance of every two cysteines [20] | S | 1 | | • | • |
5 | Average hydrophobic [20] | S | 1 | | • | • |
6 | Average hydrophobicity around cysteine [20] | S | 4 | 1 … 4 | • | • |
7 | Cysteine count [20] | S | 1 | | • | • |
8 | Cysteine location [20] | S | 5 | 1 … 5 | • | • |
9 | Cysteine odd-even index [20] | S | 1 | | • | • |
10 | Protein length [20] | S | 1 | | • | • |
11 | Cell cycle [5] | P | 1 | | • | |
12 | Cytoplasm [5] | P | 1 | | • | |
13 | Endoplasmic reticulum [5] | P | 1 | | • | |
14 | Metabolic process [5] | P | 1 | | • | |
15 | Mitochondrion [5] | P | 1 | | • | |
16 | Nucleus [5] | P | 1 | | • | |
17 | Other process [5] | P | 1 | | • | |
18 | Other localization [5] | P | 1 | | • | |
19 | Signal transduction [5] | P | 1 | | • | |
20 | Transport [5] | P | 1 | | • | |
21 | Transcription [5] | P | 1 | | • | |
22 | Betweenness centrality related to all interactions [41] | T | 1 | | • | • |
23 | Betweenness centrality related to metabolic interactions [5] | T | 1 | | • | |
24 | Betweenness centrality related to physical interactions [5] | T | 1 | | • | • |
25 | Betweenness centrality related to transcriptional regulation interactions [5] | T | 1 | | • | |
26 | Bit string of double screening scheme [this paper] | T | 1 | | • | • |
27 | Bottleneck [8,41] | T | 1 | | • | • |
28 | Clique level [7] | T | 1 | | • | • |
29 | Closeness centrality [42] | T | 1 | | • | • |
30 | Clustering coefficient [7] | T | 1 | | • | • |
31 | Degree related to all interactions [43] | T | 1 | | • | • |
32 | Degree related to physical interactions [5] | T | 1 | | • | • |
33 | Density of maximum neighborhood component [4] | T | 1 | | • | • |
34 | Edge percolated component [9] | T | 1 | | • | • |
35 | Indegree related to metabolic interaction [5] | T | 1 | | • | |
36 | Indegree related to transcriptional regulation [5] | T | 1 | | • | |
37 | Maximum neighborhood component [4] | T | 1 | | • | • |
38 | Neighbors’ intra-degree [7] | T | 1 | | • | • |
39 | Outdegree related to metabolic interaction [5] | T | 1 | | • | |
40 | Outdegree related to transcriptional regulation interaction [5] | T | 1 | | • | |
41 | Betweenness centrality related to integrated functional interaction [22] | T | 1 | | | • |
42 | Betweenness centrality related to integrated PI and GC network [22] | T | 1 | | | • |
43 | Degree related to integrated functional interaction [22] | T | 1 | | | • |
44 | Degree related to integrated PI and GC network [22] | T | 1 | | | • |
45 | Common function degree [7] | O | 1 | | • | |
46 | Essential index [7] | O | 1 | | • | |
47 | Identicalness [5] | O | 1 | | • | |
48 | Open reading frame length [7] | O | 1 | | • | • |
49 | Phyletic retention [21] | O | 1 | | • | • |
50 | Number of paralogous genes [21] | O | 1 | | | • |
51 | Codon Adaptation Index (CAI) [21,44] | O | 1 | | | • |
52 | Codon Bias Index (CBI) [21,44] | O | 1 | | | • |
53 | Frequency of optimal codons [21,44] | O | 1 | | | • |
54 | Aromaticity score [21,44] | O | 1 | | | • |
55 | Leading strand of the circular chromosome [21] | O | 1 | | | • |
Total | | | 100 | | 90 | 80 |
Notes: S. cere and E. coli denote the Saccharomyces cerevisiae and Escherichia coli datasets, respectively. Unless otherwise stated, topological features are related to physical interactions. Because of coverage and availability issues, we adopt different features for the S. cere and E. coli datasets. For example, interactions in the E. coli dataset contain integrated functional, PI, and GC network information, while those in the S. cere dataset include metabolic, transcriptional regulation, and PI network information.
Abbreviations: GC, genomic context; PI, physical interactions.
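
As an illustration of how the simplest sequence-derived entries in Table 1 could be computed, the following minimal sketch derives amino acid occurrence (feature 1, size 20), cysteine count (feature 7), and protein length (feature 10) from a raw sequence. The function name `sequence_features` and the choice to express occurrence as a relative frequency are assumptions made for this example; the exact definitions and normalization follow the cited reference and are not specified in the table.

```python
from collections import Counter

# The 20 standard amino acids in alphabetical one-letter order (A … Y).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_features(seq: str) -> dict:
    """Compute three illustrative sequence features for one protein.

    Hypothetical helper: occurrence is taken here as the relative
    frequency of each residue, which is an assumption of this sketch.
    """
    seq = seq.upper()
    counts = Counter(seq)
    length = len(seq)
    # Feature 1: one value per standard amino acid (20 values in total).
    occurrence = {
        aa: (counts[aa] / length if length else 0.0) for aa in AMINO_ACIDS
    }
    return {
        "amino_acid_occurrence": occurrence,
        "cysteine_count": counts["C"],  # Feature 7
        "protein_length": length,       # Feature 10
    }


if __name__ == "__main__":
    feats = sequence_features("MCCA")
    print(feats["cysteine_count"], feats["protein_length"])
```

Together these three features contribute 22 of the 100 values counted in the Size column; the remaining sequence features would need the definitions from the cited reference.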