Table S1.
Parameters of the RefSeq datasets of various species and of the sequence similarity networks from human Swiss-Prot/RefSeq comparisons.
| Species | Gene | RefSeq | Coverage | Component | Vertex |
|---|---|---|---|---|---|
| Trichoplax adhaerens | 11514 [32] | 11540 | 1.00 | 3628 | 14947 |
| Nematostella vectensis | 18000 [33] | 24780 | 1.38 | 3771 | 13435 |
| Hydra magnipapillata | 17250 [34] | 17398 | 1.01 | 3419 | 14332 |
| Caenorhabditis elegans | 20621 [35] | 23906 | 1.16 | 3331 | 13559 |
| Caenorhabditis briggsae | 19507 [35] | 19416 | 1.00 | 3266 | 14360 |
| Drosophila melanogaster | 14422 [36] | 16688 | 1.16 | 3747 | 13766 |
| Drosophila simulans | 16003 [36] | 15428 | 0.96 | 3585 | 15077 |
| Drosophila pseudoobscura | 16388 [36] | 16071 | 0.98 | 3649 | 14880 |
| Strongylocentrotus purpuratus | 23300 [37] | 42324 | 1.82 | 3671 | 13255 |
| Branchiostoma floridae | 21900 [38] | 28639 | 1.31 | 3893 | 13055 |
| Ciona intestinalis | 16000 [39] | 13065 | 0.82 | 4014 | 14748 |
| Danio rerio | 32174 [40] | 27845 | 0.87 | 3178 | 9034 |
| Xenopus tropicalis | 20500 [40] | 8564 | 0.42 | 4283 | 16136 |
| Gallus gallus | 23212 [41] | 18792 | 0.81 | 3964 | 11411 |
| Mus musculus | 22011 [42] | 35989 | 1.64 | 1628 | 4419 |
| Homo sapiens | 20500 [43] | 38292 | 1.87 | 193 | 419 |
Species - the species whose RefSeq proteomes were used to cluster paralogous human Swiss-Prot sequences. Gene - number of predicted genes in the given species (numbers in square brackets refer to the references listed below). RefSeq - number of entries in the RefSeq proteome of the given species. Coverage - RefSeq/Gene, a parameter reflecting the incompleteness or redundancy of the RefSeq proteome of the given species. (Note that because a dataset may be redundant and incomplete at the same time, a coverage value close to 1.0 does not guarantee that the dataset is complete and non-redundant.) Component - number of components in the sequence similarity network obtained by comparing the RefSeq proteome of the given species with human Swiss-Prot entries. Vertex - number of human Swiss-Prot entries in the network that are connected by sequences of the target genome.
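
The Coverage column can be recomputed from the Gene and RefSeq columns. The following is a minimal sketch, assuming the ratio is simply rounded to two decimals; the function name `coverage` and the choice of three example rows are illustrative and not part of the original analysis.

```python
def coverage(refseq_entries: int, predicted_genes: int) -> float:
    """Return RefSeq/Gene, the Coverage ratio defined in the table legend."""
    if predicted_genes <= 0:
        raise ValueError("predicted_genes must be positive")
    return refseq_entries / predicted_genes


# (Gene, RefSeq) pairs copied from three rows of Table S1.
rows = {
    "Nematostella vectensis": (18000, 24780),
    "Ciona intestinalis": (16000, 13065),
    "Homo sapiens": (20500, 38292),
}

# Round to two decimals, matching the precision used in the table.
table = {
    species: round(coverage(refseq, gene), 2)
    for species, (gene, refseq) in rows.items()
}

for species, value in table.items():
    print(f"{species}: {value:.2f}")
```

For these three rows the recomputed values agree with the Coverage column (1.38, 0.82 and 1.87). As the legend notes, the ratio alone cannot distinguish a complete, non-redundant dataset from one that is both incomplete and redundant.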