Table 2.
Carbonylation predictor | Machine learning algorithm | Parameters | Data Set | Area under the ROC curve¹ | Reference |
---|---|---|---|---|---|
CarSPred | Weighted support vector machine | Position-specific propensity of amino acid, k-spaced amino acid pair, KNN scores, physicochemical properties (electric properties, hydrophobicity, alpha and turn propensities, etc.) | 331 lysine, 131 arginine, 128 threonine, and 129 proline carbonylation sites were extracted from 230 carbonylated human proteins. In addition, 22 lysine, 3 arginine, 6 threonine, and 15 proline carbonylation sites were extracted from carbonylated mouse, rabbit and bovine proteins. | Lysine: 0.6704; Arginine: 0.5345; Threonine: 0.6800; Proline: 0.7873 | 106 |
iCar-PseCp | Random forest | Pseudo amino acid composition | Data was derived from 230 human carbonylated protein sequences and 20 carbonylated proteins from Photobacterium and Escherichia coli. | Lysine: 0.8728; Arginine: 0.8668; Threonine: 0.8603; Proline: 0.8484 | 108 |
iCarPS | Random forest | 3-D conical coordinates and physicochemical properties (hydrophobicity, hydrophilicity, mass, pK1, pK2, pI, rigidity, flexibility, and irreplaceability) | Same benchmark dataset as Lv, et al. (2014).109 | Lysine: 0.789; Arginine: 0.726; Threonine: 0.790; Proline: 0.814 | 109 |
¹ Area under the curve (AUC) was derived for the receiver operating characteristic (ROC) curve. The ROC curve plots the sensitivity (i.e., true positive rate) against 1 − specificity (i.e., false positive rate).
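
For readers unfamiliar with the metric, the area under the ROC curve can be computed directly from predicted scores and true labels. The following is a minimal sketch, not the procedure used by any of the predictors in the table: the function name `roc_auc` and the toy labels and scores are hypothetical. It relies on the standard equivalence between the area under the curve of true positive rate against false positive rate (1 − specificity) and the fraction of positive/negative pairs in which the positive site receives the higher score.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as half).

    labels: iterable of 1 (carbonylated site) or 0 (non-carbonylated site)
    scores: iterable of predictor outputs, higher meaning more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two positive and two negative sites.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.6, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of positive from negative sites, which is the scale on which the per-residue values in the table should be read.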