Table 3.
Target value | Total actuals | Correctly predicted (%) | Cost |
---|---|---|---|
Insoluble | 549 | 56.50 | 239 |
In PDB | 290 | 74.80 | 73 |
The attributes identified in Table 2 were used to calculate properties for the MCSG-INSOLUBLE and MCSG-PDB data sets. The resulting matrix was used to build a support vector machine (SVM) classification model, with the entries split 60/40: the 60% portion was used for training and the 40% portion for evaluating the accuracy of the prediction. SVM model building and validation were repeated five times. The insoluble proteins were predicted with 56.5% accuracy, while the PDB deposits were predicted with 74.8% accuracy.
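The columns of Table 3 can be cross-checked with a little arithmetic. The sketch below assumes that the Cost column counts the entries of each class that were not predicted correctly; the source does not define Cost, so this reading is an assumption, as are the helper name `misclassified` and the pooled accuracy figure, which is derived here and not reported in the table.

```python
# Rows of Table 3: (target value, total actuals, correctly predicted %, cost)
rows = [
    ("Insoluble", 549, 56.50, 239),
    ("In PDB", 290, 74.80, 73),
]


def misclassified(total, pct_correct):
    """Entries not predicted correctly, rounded to a whole count.

    Assumption: this is what the Cost column reports.
    """
    correct = round(total * pct_correct / 100)
    return total - correct


for target, total, pct, cost in rows:
    print(f"{target}: {misclassified(total, pct)} misclassified, Cost column = {cost}")

# Pooled accuracy over both classes (derived, not stated in the table).
total_entries = sum(total for _, total, _, _ in rows)
total_correct = sum(total - misclassified(total, pct) for _, total, pct, _ in rows)
overall_accuracy = 100 * total_correct / total_entries
print(f"Overall accuracy: {overall_accuracy:.1f}%")
```

Under this reading, both rows reproduce the Cost column (239 and 73), and the pooled accuracy over the 839 entries comes to roughly 62.8%.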