Table 1.
InterPro data source | Total annotated sequences (27 379) | Sequence coverage | Residue coverage |
---|---|---|---|
BlastProDom | 425 | 0.0155 | 0.0031 |
FPrintScan | 3686 | 0.1346 | 0.0285 |
Gene3D | 12 293 | 0.4490 | 0.2799 |
HAMAP | 133 | 0.0049 | 0.0035 |
HMMPIR | 1228 | 0.0449 | 0.0469 |
HMMPANTHER | 14 973 | 0.5469 | 0.4687 |
HMMPfam | 20 859 | 0.7619 | 0.4120 |
HMMSMART | 7809 | 0.2852 | 0.1120 |
HMMTIGR | 3105 | 0.1134 | 0.0874 |
PatternScan | 5221 | 0.1907 | 0.0116 |
ProfileScan | 8798 | 0.3213 | 0.1466 |
SUPERFAMILY | 15 399 | 0.5624 | 0.4174 |
All | 22 591 | 0.8251 | 0.6932 |
GFam_NoFilter | 22 826 | 0.8337 | 0.6147 |
GFam_NoFilter_No-novel | 22 591 | 0.8251 | 0.5906 |
GFam_WithFilter | 22 634 | 0.8267 | 0.6065 |
GFam_WithFilter_No-novel | 22 382 | 0.8175 | 0.5809 |
Sequence coverage from GFam output for the TAIR9 proteome was calculated as the number of sequences with at least one domain divided by the total number of sequences (the number in parentheses in the table header). Residue coverage was calculated as the number of residues covered by at least one domain divided by the total number of residues in all sequences. GFam_NoFilter is the coverage obtained by GFam when the domain annotations from the member resources are taken as is, and it also includes coverage from novel domains. GFam_NoFilter_No-novel is the same as GFam_NoFilter but excludes coverage from novel domains. GFam_WithFilter is the coverage calculated after applying the filters described in the text. GFam_WithFilter_No-novel is the same as GFam_WithFilter but excludes coverage from novel domains.
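
The two coverage definitions above can be illustrated with a minimal Python sketch. This is not GFam's own code. The function name `coverage`, the dictionary-based input layout, and the use of 1-based inclusive domain coordinates are assumptions made for illustration only. Overlapping domains on the same sequence are merged so that each residue is counted at most once, which is one straightforward reading of "covered by at least one domain".

```python
def coverage(seq_lengths, domains):
    """Return (sequence_coverage, residue_coverage).

    seq_lengths: dict mapping sequence ID -> sequence length (hypothetical layout)
    domains:     dict mapping sequence ID -> list of (start, end) domain
                 coordinates, assumed 1-based and inclusive
    """
    annotated = 0          # sequences with at least one domain
    covered_residues = 0   # residues covered by at least one domain

    for seq_id in seq_lengths:
        intervals = sorted(domains.get(seq_id, []))
        if intervals:
            annotated += 1

        # Merge overlapping intervals so each residue is counted once.
        cur_start = cur_end = None
        for start, end in intervals:
            if cur_end is None or start > cur_end:
                if cur_end is not None:
                    covered_residues += cur_end - cur_start + 1
                cur_start, cur_end = start, end
            else:
                cur_end = max(cur_end, end)
        if cur_end is not None:
            covered_residues += cur_end - cur_start + 1

    total_residues = sum(seq_lengths.values())
    return annotated / len(seq_lengths), covered_residues / total_residues


# Toy input: three sequences, two of which carry domains.
seq_lengths = {"a": 100, "b": 50, "c": 50}
domains = {"a": [(1, 40), (30, 60)], "b": [(11, 20)]}
seq_cov, res_cov = coverage(seq_lengths, domains)
```

In this toy input, two of three sequences are annotated and 70 of 200 residues are covered (the overlapping domains on sequence "a" merge into residues 1 to 60). The same ratio applied to the "All" row of the table gives 22 591 / 27 379 ≈ 0.8251, matching the reported sequence coverage.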