Table 2:
Random forest classifier performance discriminating high- from lower-confidence predicted protein structures
| Metrics used | Actual class | Hold-out data: predicted HC | Hold-out data: predicted LC | Hold-out data: class error | All data: predicted HC | All data: predicted LC | All data: class error |
|---|---|---|---|---|---|---|---|
| All metrics | HC | 228 | 22 | 0.088 | 1054 | 34 | 0.031 |
| All metrics | LC | 24 | 226 | 0.096 | 305 | 3437 | 0.082 |
| % AA identity omitted | HC | 227 | 23 | 0.092 | 1048 | 40 | 0.037 |
| % AA identity omitted | LC | 29 | 221 | 0.116 | 349 | 3393 | 0.093 |
Metrics for 71 mainly ribosomal protein structures were insufficient for inclusion in data sets for the random forest.
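
The class-error columns in Table 2 can be recomputed from the counts. The sketch below assumes that class error is the fraction of structures in an actual class that were assigned to the other class; the source does not state this definition, but it reproduces the tabulated values. The function name `class_errors` and the variable names are illustrative only.

```python
def class_errors(matrix):
    """Per-class error rates for a square confusion matrix.

    Rows are actual classes and columns are predicted classes, in the
    same order (here: HC, LC). Assumed definition: misclassified / row total.
    """
    errors = []
    for i, row in enumerate(matrix):
        total = sum(row)
        errors.append((total - row[i]) / total)
    return errors


# Counts copied from the "All metrics" rows of Table 2.
hold_out_all_metrics = [[228, 22], [24, 226]]
all_data_all_metrics = [[1054, 34], [305, 3437]]

print([round(e, 3) for e in class_errors(hold_out_all_metrics)])  # [0.088, 0.096]
print([round(e, 3) for e in class_errors(all_data_all_metrics)])  # [0.031, 0.082]
```

Applying the same function to the "% AA identity omitted" rows gives the remaining class-error values in the table.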