2018 Dec 6;8(1):giy150. doi: 10.1093/gigascience/giy150

Table 2:

Random forest classifier performance discriminating high- from lower-confidence predicted protein structures

                                   Hold-out data (predicted)       All data (predicted)
Metrics used            Actual     HC     LC     Class error       HC      LC      Class error
All metrics             HC         228    22     0.088             1054    34      0.031
                        LC         24     226    0.096             305     3437    0.082
% AA identity omitted   HC         227    23     0.092             1048    40      0.037
                        LC         29     221    0.116             349     3393    0.093

Metrics for 71 mainly ribosomal protein structures were insufficient for inclusion in data sets for the random forest.
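
The class error values in the table appear to be the fraction of each actual class that was assigned to the other class (for example, 22 of 250 hold-out HC structures predicted as LC). The following minimal Python sketch recomputes them from the counts under that assumption. The function name `class_errors` and its argument names are hypothetical and are not taken from the original analysis.

```python
def class_errors(hc_as_hc, hc_as_lc, lc_as_hc, lc_as_lc):
    """Per-class error from a 2x2 confusion matrix (rows = actual class).

    Assumption: class error = misclassified count / total actual count
    for that class. This matches the values reported in the table.
    """
    hc_error = hc_as_lc / (hc_as_hc + hc_as_lc)
    lc_error = lc_as_hc / (lc_as_hc + lc_as_lc)
    return hc_error, lc_error


# Counts copied from the "All metrics" rows of the table.
hold_out = class_errors(228, 22, 24, 226)
all_data = class_errors(1054, 34, 305, 3437)

# Counts copied from the "% AA identity omitted" rows of the table.
hold_out_no_aa = class_errors(227, 23, 29, 221)
all_data_no_aa = class_errors(1048, 40, 349, 3393)

for label, (hc, lc) in [
    ("hold-out, all metrics", hold_out),
    ("all data, all metrics", all_data),
    ("hold-out, % AA identity omitted", hold_out_no_aa),
    ("all data, % AA identity omitted", all_data_no_aa),
]:
    print(f"{label}: HC error = {hc:.3f}, LC error = {lc:.3f}")
```

Rounded to three decimal places, the results agree with the Class error columns, which suggests the counts and error rates in the table are internally consistent.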