Table 2.
Filters applied | Filter 1 | Filter 1 & Filter 2 | Filter 1 & Filter 3 | Filter 1 & Filter 4 | Filter 1 & Filter 5 |
# of groups | 196 | 74 | 53 | 185 | 11 |
# of terms | 345 | 159 | 102 | 338 | 16 |
# of genes | 3213 | 711 | 409 | 2895 | 320 |
Precision | 67.91% | 62.52% | 60.52% | 67.73% | 64.70% |
Recall | 22.98% | 26.16% | 19.78% | 23.80% | 11.21% |
To push precision and recall closer to the precision ceiling, we sought criteria for selecting appropriate gene groups a priori. To this end, we defined the following filters for our 1,000 'phenoclusters':
Filter 1: Removes groups with fewer than 3 genes or with no GO term associated with at least 50% of the genes
Filter 2: Removes groups with a GO-similarity score < 0.4
Filter 3: Removes groups with a PPi-connectedness < 33%
Filter 4: Removes all non-single-species clusters
Filter 5: Removes all single-species clusters
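
The filter criteria above can be read as simple predicates over a gene group. The following is a minimal sketch, not the implementation used in this work: the `GeneGroup` record, its field names, and the `select` helper are hypothetical, and it assumes that the per-group scores (GO-term coverage, GO-similarity, PPi-connectedness) have already been computed and that 33% is represented as 0.33.

```python
from dataclasses import dataclass


@dataclass
class GeneGroup:
    """Hypothetical record for one phenocluster; all field names are illustrative."""
    genes: list               # gene identifiers in the group
    top_go_coverage: float    # fraction of genes annotated with the best-covered GO term (0..1)
    go_similarity: float      # precomputed GO-similarity score of the group
    ppi_connectedness: float  # precomputed PPi-connectedness as a fraction (0..1)
    species: set              # species represented in the group


def passes_filter1(g):
    # Keep groups with at least 3 genes and a GO term shared by >= 50% of the genes.
    return len(g.genes) >= 3 and g.top_go_coverage >= 0.5


def passes_filter2(g):
    # Keep groups with a GO-similarity score of at least 0.4.
    return g.go_similarity >= 0.4


def passes_filter3(g):
    # Keep groups with a PPi-connectedness of at least 33% (assumed to mean 0.33).
    return g.ppi_connectedness >= 0.33


def passes_filter4(g):
    # Keep only single-species clusters.
    return len(g.species) == 1


def passes_filter5(g):
    # Keep only clusters spanning more than one species.
    return len(g.species) > 1


def select(groups, *predicates):
    # Return the groups that satisfy every predicate, preserving input order.
    return [g for g in groups if all(p(g) for p in predicates)]


# Made-up example groups, used only to exercise the predicates.
a = GeneGroup(["g1", "g2", "g3"], 0.67, 0.55, 0.50, {"human"})
b = GeneGroup(["g4", "g5"], 1.00, 0.90, 1.00, {"human"})
c = GeneGroup(["g6", "g7", "g8", "g9"], 0.50, 0.30, 0.25, {"human", "fly"})
groups = [a, b, c]

filter1_and_2 = select(groups, passes_filter1, passes_filter2)
```

Each column of Table 2 corresponds to one such combination, for example `select(groups, passes_filter1, passes_filter3)` for "Filter 1 & Filter 3". Because Filters 4 and 5 are complementary, the groups they retain on top of Filter 1 partition the Filter 1 set, which matches the group counts in Table 2 (185 + 11 = 196).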