Table 3.
Coverage and Agreement of Predictions with EBI-Curated Annotation of 4357 Human Genes Found in SWISS-PROT Using Both ProDom and CDD Rules
IEA^a | Rule subset^b | P-value ratio^c | Coverage^d | Agreement^e: yes | Agreement^e: some | Agreement^e: none
---|---|---|---|---|---|---
No | All | All | **3517 (81%)** | 74% | 9% | 18%
No | High+Medium confidence | All | 2631 (74%) | 81% | 6% | 12%
No | High+Medium confidence | >= 1 | 2463 (70%) | 82% | 6% | 12%
No | High+Medium confidence | < 1 | 400 (11%) | 67% | 16% | 17%
No | Low confidence | All | 2821 (80%) | 67% | 11% | 23%
No | Low confidence | >= 1 | 2685 (76%) | 69% | 10% | 20%
No | Low confidence | < 1 | 601 (17%) | 39% | 15% | 46%
No | ‘One Protein’ | All | 2275 (65%) | 61% | 15% | 25%
No | ‘One Protein’ | >= 1 | 2052 (58%) | 64% | 14% | 23%
No | ‘One Protein’ | < 1 | 486 (14%) | 37% | 22% | 41%
No | H+M - ‘One Protein’ | >= 1 | 2201 (63%) | 84% | 5% | 11%
Yes | All | All | **3909 (90%)** | 76% | 7% | 17%
Yes | H+M - ‘One Protein’ | >= 1 | 2902 (74%) | 83% | 5% | 12%
^a Indicates whether IEA GO Function associations were used to build rules.

^b Indicates the subsets of the rules that were considered when making predictions. We considered different rule confidences and rule types. Though ‘one protein’ rules are less desirable a priori, some have high confidence if the p-value threshold is low or they have been manually reviewed. “H+M - ‘One Protein’” means the high and medium confidence rules that are not ‘one protein’ rules.

^c Indicates what subset of similarities between human proteins and domains were considered when making predictions. The p-value ratio is defined as -log(sim pv) / -log(rule pv), where sim pv is the p-value of the similarity and rule pv is the p-value threshold associated with the rule, so a value less than one indicates a similarity that is less stringent than the rule's threshold.

^d Number and percent coverage of proteins. Bold entries indicate coverage based on 4357 proteins; the others are based on the number of proteins covered with all rules and all p-values, with or without use of IEA annotation. Note that multiple functions are often predicted for proteins, so coverage percentages may sum to more than 100%.

^e Best agreement per protein between predictions and curated annotation is categorized as agreeing (exactly or with more or less specificity), showing some agreement (paths to terms overlap but differ at leaf terms), or showing no agreement.
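
The p-value ratio used to split the rows into ">= 1" and "< 1" can be illustrated with a minimal Python sketch. The function names `p_value_ratio` and `meets_rule_threshold` and the example p-values are hypothetical, chosen only for illustration; the sketch assumes both p-values lie strictly between 0 and 1, and it is not the authors' implementation.

```python
import math


def p_value_ratio(sim_pv: float, rule_pv: float) -> float:
    """Return -log(sim_pv) / -log(rule_pv), following the table footnote.

    The two minus signs cancel, and the base of the logarithm does not
    affect the ratio. Both arguments are assumed to be p-values strictly
    between 0 and 1 (an assumption of this sketch, not stated in the source).
    """
    for name, pv in (("sim_pv", sim_pv), ("rule_pv", rule_pv)):
        if not 0.0 < pv < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1, got {pv}")
    return math.log(sim_pv) / math.log(rule_pv)


def meets_rule_threshold(sim_pv: float, rule_pv: float) -> bool:
    """True when the similarity is at least as stringent as the rule threshold."""
    return p_value_ratio(sim_pv, rule_pv) >= 1.0


# Hypothetical values: a similarity with p = 1e-20 against a rule threshold
# of 1e-10 has a ratio of about 2 and falls in the ">= 1" rows.
print(meets_rule_threshold(1e-20, 1e-10))
# A similarity with p = 1e-5 against the same threshold has a ratio of
# about 0.5 and falls in the "< 1" rows.
print(meets_rule_threshold(1e-5, 1e-10))
```

Under this definition, a smaller similarity p-value gives a larger ratio, so the ">= 1" rows contain only similarities at least as strong as the rule's own threshold.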