Table 2.
Method (protein datasetᵃ) | RMSE | MAE | R² | Outliersᶜ | Method (peptide datasetᵇ) | RMSE | MAE | R² | Outliersᶜ |
---|---|---|---|---|---|---|---|---|---|
IPC2.protein.svr.19 | 0.8479 | 0.5906 | 0.5934 | 247 | IPC2.peptide.Conv2D | 0.2216 | 0.1216 | 0.9761 | 2691 |
IPC2_protein | 0.8608 | 0.6052 | 0.5748 | 251 | IPC2.peptide.svr.19 | 0.2299 | 0.1155 | 0.9743 | 2490 |
IPC_protein | 0.8677 | 0.6109 | 0.5760 | 250 | IPC2_peptide | 0.2482 | 0.1394 | 0.9700 | 3179 |
ProMoST | 0.9113 | 0.6444 | 0.5183 | 263 | Bjellqvist | 0.4051 | 0.2836 | 0.9204 | 11639 |
Toseland | 0.9278 | 0.6537 | 0.5095 | 250 | Nozaki | 0.4083 | 0.2673 | 0.9191 | 9837 |
Dawson | 0.9365 | 0.6586 | 0.4977 | 263 | DTASelect | 0.4235 | 0.2796 | 0.9130 | 10606 |
Bjellqvist | 0.9369 | 0.6536 | 0.5005 | 260 | Thurlkill | 0.4466 | 0.2535 | 0.9033 | 7182 |
Wikipedia | 0.9484 | 0.6795 | 0.4860 | 262 | Sillero | 0.4747 | 0.2696 | 0.8907 | 7607 |
Rodwell | 0.9579 | 0.6762 | 0.4706 | 262 | Dawson | 0.4910 | 0.2642 | 0.8831 | 6698 |
Grimsley | 0.9588 | 0.6953 | 0.4779 | 265 | Wikipedia | 0.5178 | 0.2974 | 0.8700 | 8326 |
Lehninger | 0.9617 | 0.6783 | 0.4607 | 266 | Grimsley | 0.5264 | 0.3796 | 0.8656 | 15956 |
Solomon | 0.9631 | 0.6746 | 0.4606 | 272 | Rodwell | 0.5855 | 0.3429 | 0.8337 | 9857 |
pIR | 1.0148 | 0.7556 | 0.4161 | 315 | Toseland | 0.5860 | 0.3896 | 0.8335 | 13152 |
Nozaki | 1.0164 | 0.7219 | 0.3980 | 288 | EMBOSS | 0.5971 | 0.3557 | 0.8271 | 11022 |
Thurlkill | 1.0250 | 0.7573 | 0.3948 | 302 | PredpI-iTRAQ8 | 0.6302 | 0.3503 | 0.8027 | 12059 |
DTASelect | 1.0278 | 0.7798 | 0.3947 | 319 | PredpI-TMT6 | 0.6365 | 0.3518 | 0.7988 | 12135 |
EMBOSS | 1.0498 | 0.7757 | 0.3734 | 308 | PredpI-plain | 0.6480 | 0.3710 | 0.7913 | 12813 |
Sillero | 1.0519 | 0.7694 | 0.3461 | 308 | IPC_peptide | 0.7459 | 0.4860 | 0.7302 | 13599 |
Patrickios | 2.3764 | 1.8414 | <0 | 517 | Solomon | 0.7518 | 0.4929 | 0.7259 | 13777 |
PredpI-TMT6 | NA | NA | NA | NA | Lehninger | 0.7697 | 0.5209 | 0.7127 | 15200 |
PredpI-plain | NA | NA | NA | NA | pIR | 0.8529 | 0.7303 | 0.6387 | 27158 |
PredpI-iTRAQ8 | NA | NA | NA | NA | ProMoST | 1.1026 | 0.7562 | 0.4104 | 18513 |
| | | | | | Patrickios | 2.0172 | 1.3927 | <0 | 22818 |
ᵃ Protein dataset consisting of 581 proteins (25% randomly chosen proteins, not used for the training or optimization).
ᵇ Peptide dataset consisting of 29 774 peptides (25% randomly chosen peptides, not used for the training or optimization).
ᶜ Outliers were defined using thresholds of 0.5 and 0.25 pH units for the difference between the predicted and experimental pI, for the protein and peptide datasets, respectively.
NA: The PredpI program was designed for peptides only within the 3.7–4.9 pH range; thus, for proteins, it returned 0 and could not be evaluated on the protein dataset.
New machine learning models developed in this study are in bold. The first version of IPC (12) is underlined. Scores were calculated after 10-fold cross-validation. The table is sorted by RMSE. For individual methods’ predictions, see Supplementary Data 2. For more details about the datasets, see Table 1.
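
For readers who wish to reproduce the four columns of the table from a list of predictions, the following is a minimal sketch rather than the code used in this study. It assumes that R² is the coefficient of determination, 1 − SS_res/SS_tot (consistent with the "<0" entries, which arise when a method performs worse than predicting the mean experimental pI), and that a prediction counts as an outlier when its absolute error strictly exceeds the threshold. The function name `pi_metrics` and the strict inequality are illustrative assumptions.

```python
import math


def pi_metrics(predicted, experimental, outlier_threshold):
    """Return RMSE, MAE, R^2 and the outlier count for paired pI values.

    Hypothetical helper: the definitions below are assumptions, not
    taken from the original study.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(experimental)
    errors = [p - e for p, e in zip(predicted, experimental)]

    # Residual sum of squares, shared by RMSE and R^2.
    ss_res = sum(err ** 2 for err in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(err) for err in errors) / n

    # Coefficient of determination; negative when the method is worse
    # than always predicting the mean experimental pI.
    mean_exp = sum(experimental) / n
    ss_tot = sum((e - mean_exp) ** 2 for e in experimental)
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    # Assumption: strict inequality against the threshold
    # (0.5 pH units for proteins, 0.25 for peptides in the table notes).
    outliers = sum(1 for err in errors if abs(err) > outlier_threshold)

    return {"rmse": rmse, "mae": mae, "r2": r2, "outliers": outliers}


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    print(pi_metrics([5.0, 6.0, 7.5, 9.0], [5.2, 6.0, 7.0, 8.0], 0.5))
```

Under these assumptions, the protein columns would be computed with `outlier_threshold=0.5` and the peptide columns with `outlier_threshold=0.25`.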