Table 1.
Performance of regression models trained on the range of protein sequence descriptors using LOCO cross-validation
Feature Descriptors | OLSR Pr | OLSR P-value | OLSR RMSE | RFR Pr | RFR P-value | RFR RMSE | SVR Pr | SVR P-value | SVR RMSE |
---|---|---|---|---|---|---|---|---|---|
AAC | 0.20 | 1.5 × 10−2 | 3.19 | 0.40 | 6.4 × 10−7 | 2.66 | 0.40 | 1.0 × 10−6 | 2.69 |
Blosum | 0.20 | 1.4 × 10−2 | 3.10 | 0.37 | 2.8 × 10−7 | 2.71 | 0.39 | 1.5 × 10−5 | 2.67 |
propy | 0.14 | 1.3 × 10−1 | 3.67 | 0.39 | 3.0 × 10−3 | 2.64 | 0.41 | 1.1 × 10−6 | 2.60 |
PSSM | 0.19 | 7.2 × 10−1 | 3.68 | 0.38 | 1.1 × 10−5 | 2.67 | 0.37 | 1.5 × 10−5 | 2.66 |
ProtParam | 0.25 | 3.0 × 10−3 | 2.82 | 0.34 | 4.7 × 10−5 | 2.72 | 0.37 | 9.4 × 10−6 | 2.64 |
SW kernel | n/a | n/a | n/a | n/a | n/a | n/a | 0.37 | 2.1 × 10−6 | 2.63 |
LA kernel | n/a | n/a | n/a | n/a | n/a | n/a | 0.44 | 1.2 × 10−8 | 2.56 |
MM kernel | n/a | n/a | n/a | n/a | n/a | n/a | 0.38 | 7.1 × 10−6 | 2.66 |