Table 2.
Percentage of proteins
Data set | SI-based predictions: high intrinsic potential | SI-based predictions: low intrinsic potential | WH scheme predictions^a: soluble | WH scheme predictions^a: insoluble |
S | 100 | 0 | 32 | 68 |
I | 30^b | 70 | 22 | 78 |
T-soluble | 60 | 40 | 13 | 87 |
T-insoluble | 36 | 64 | 28 | 72 |
^a By Davis et al. (1999). The solubility of the protein is predicted from a canonical value, CV, calculated as CV = 15.43 × [(N + G + P + S) ÷ n] - 29.56 × |[(R + K) - (D + E)] ÷ n - 0.03|, where N, G, P, S, R, K, D, and E represent the numbers of Asn, Gly, Pro, Ser, Arg, Lys, Asp, and Glu residues in the protein, respectively, and n is the total number of amino acids in the protein. If the difference CV - CV′ (where CV′ is a discriminant whose value has been set to 1.71) is positive, the protein is predicted to be insoluble; if the difference is negative, the protein is predicted to be soluble.
^b These proteins may be regarded as potential candidates for mutation studies aimed at enhancing solubility.
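
The WH scheme calculation described in footnote a can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `canonical_value` and `predict_solubility` are hypothetical, the grouping of terms follows one reading of the footnote (both residue counts are divided by n before the 0.03 offset is subtracted), and the treatment of a difference of exactly zero, which the footnote does not cover, is an assumption.

```python
CV_PRIME = 1.71  # discriminant value stated in footnote a


def canonical_value(sequence: str) -> float:
    """Return CV for a one-letter amino acid sequence.

    Assumed grouping of the footnote formula:
    CV = 15.43 * (N + G + P + S) / n - 29.56 * |((R + K) - (D + E)) / n - 0.03|
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must contain at least one residue")

    # Number of Asn, Gly, Pro, and Ser residues.
    ngps = sum(seq.count(residue) for residue in "NGPS")
    # (Arg + Lys) minus (Asp + Glu).
    net_charge = (seq.count("R") + seq.count("K")) - (seq.count("D") + seq.count("E"))

    return 15.43 * (ngps / n) - 29.56 * abs(net_charge / n - 0.03)


def predict_solubility(sequence: str) -> str:
    """Classify a sequence from the sign of CV - CV'."""
    difference = canonical_value(sequence) - CV_PRIME
    # Assumption: a difference of exactly zero is reported as "soluble";
    # the footnote defines only the positive and negative cases.
    return "insoluble" if difference > 0 else "soluble"


if __name__ == "__main__":
    for example in ("NGPS" * 5, "DDDDEEEEAA"):
        print(example, round(canonical_value(example), 4), predict_solubility(example))
```

The two example sequences are artificial and serve only to exercise both branches: the first is composed entirely of Asn, Gly, Pro, and Ser and is classified as insoluble, while the second is strongly acidic and is classified as soluble.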