2015 Mar 30;10(3):e0117955. doi: 10.1371/journal.pone.0117955

Table 5. Comparison of RFs induced using non-redundant subsets of the Kinase dataset.

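| Threshold | Non-redundant Observations (Pos/Unl) | Non-redundant Dataset: G Mean | Entire Dataset: TP | Entire Dataset: FP | Entire Dataset: TN | Entire Dataset: FN | Entire Dataset: G Mean |
|---|---|---|---|---|---|---|---|
| 20% | 102 (18/84) | 0.79 | 51 | 196 | 371 | 43 | 0.60 |
| 30% | 198 (26/172) | 0.85 | 49 | 165 | 402 | 45 | 0.61 |
| 40% | 332 (49/283) | 0.78 | 75 | 184 | 383 | 19 | 0.73 |
| 50% | 432 (67/365) | 0.79 | 72 | 120 | 447 | 22 | 0.78 |
| 60% | 497 (77/420) | 0.81 | 77 | 132 | 435 | 17 | 0.79 |
| 70% | 569 (83/486) | 0.79 | 72 | 118 | 449 | 22 | 0.78 |
| 80% | 625 (88/537) | 0.80 | 72 | 112 | 455 | 22 | 0.78 |
| 90% | 650 (94/556) | 0.79 | 69 | 90 | 477 | 25 | 0.79 |
| 100% | 661 (94/567) | 0.80 | 72 | 98 | 469 | 22 | 0.80 |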

For each threshold, a non-redundant dataset was generated using Leaf and used to induce an RF. The RF was then used to classify the proteins in both the non-redundant dataset it was trained on and the entire Kinase dataset. The TPs/FNs are the numbers of positive proteins in the entire dataset predicted correctly/incorrectly, and the TNs/FPs are the numbers of unlabelled proteins predicted correctly/incorrectly.
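As an illustration of how the G Mean column for the entire dataset relates to the TP/FP/TN/FN counts, the following minimal sketch assumes that G Mean is the geometric mean of sensitivity and specificity. That definition is not stated in this table; it is the usual one, and the reported values are consistent with it. The function name `g_mean` is hypothetical and chosen only for this example.

```python
import math


def g_mean(tp: int, fp: int, tn: int, fn: int) -> float:
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    sensitivity = tp / (tp + fn)  # fraction of positive proteins predicted correctly
    specificity = tn / (tn + fp)  # fraction of unlabelled proteins predicted correctly
    return math.sqrt(sensitivity * specificity)


# Entire-dataset counts (TP, FP, TN, FN) copied from two rows of the table.
rows = {
    "20%": (51, 196, 371, 43),
    "100%": (72, 98, 469, 22),
}

for threshold, counts in rows.items():
    print(f"{threshold}: G mean = {g_mean(*counts):.2f}")
```

Under this assumption the 20% and 100% rows reproduce the tabulated entire-dataset values of 0.60 and 0.80.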