Table 1. Size of the taxon-specific protein databases derived for the dominant groups from the NCBI nr protein database, and the number of mRNA-tags that showed BLASTX hits to the respective database.
Layer | Taxon | Number of proteins in taxon-specific database | Number of annotated mRNA-tags (proportion of total annotated mRNA-tags) |
---|---|---|---|
Oxic surface layer | Cyanobacteria | 680,891 | 18,768 (55.2%) |
Oxic surface layer | Fungi | 2,525,420 | 10,189 (30.0%) |
Oxic surface layer | Xanthomonadales | 360,722 | 17,050 (50.2%) |
Oxic surface layer | Myxococcales | 185,417 | 19,821 (58.3%) |
Oxic surface layer | Methylococcales | 40,965 | 13,741 (40.4%) |
Anoxic bulk soil | Clostridia | 1,709,159 | 15,903 (52.6%) |
Anoxic bulk soil | Actinobacteria | 3,817,624 | 17,903 (59.3%) |
Anoxic bulk soil | Geobacter | 76,981 | 11,314 (37.4%) |
Anoxic bulk soil | Anaeromyxobacter | 36,566 | 10,455 (34.6%) |
Anoxic bulk soil | Euryarchaeota | 580,253 | 11,814 (39.1%) |
Anoxic bulk soil | Anaerolineae | 6,335 | 8,167 (27.0%) |
Note that most of the mRNA-tags had hits in more than one taxon, so the percentages within each layer sum to more than 100%. The percentage values in parentheses indicate the proportion of total annotated mRNA-tags that had homologous genes in the individual taxon-specific protein database. The distribution of these homologous genes across taxa was visualized in an Edwards-Venn diagram (S3 Fig).
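The relationship between the counts and the percentages in Table 1 can be checked with a short calculation. The sketch below is illustrative only. It assumes each percentage is the row count divided by the total number of annotated mRNA-tags for that layer, and back-calculates the implied total from each row. The table does not report these totals, so the results are estimates limited by the rounding of the published percentages. The names `TABLE_1`, `implied_total`, `implied_total_range`, and `percent_sum` are hypothetical helpers, not part of the original analysis.

```python
# Counts and percentages are copied from Table 1.
# Each entry maps a taxon to (annotated mRNA-tags, percent of total annotated).
TABLE_1 = {
    "Oxic surface layer": {
        "Cyanobacteria": (18768, 55.2),
        "Fungi": (10189, 30.0),
        "Xanthomonadales": (17050, 50.2),
        "Myxococcales": (19821, 58.3),
        "Methylococcales": (13741, 40.4),
    },
    "Anoxic bulk soil": {
        "Clostridia": (15903, 52.6),
        "Actinobacteria": (17903, 59.3),
        "Geobacter": (11314, 37.4),
        "Anaeromyxobacter": (10455, 34.6),
        "Euryarchaeota": (11814, 39.1),
        "Anaerolineae": (8167, 27.0),
    },
}


def implied_total(count, percent):
    """Back-calculate the total annotated mRNA-tags implied by one row.

    Assumption: percent = 100 * count / total, with percent rounded
    to one decimal place in the published table.
    """
    return count / (percent / 100.0)


def implied_total_range(rows):
    """Return the (lowest, highest) implied total across a layer's rows."""
    estimates = [implied_total(count, pct) for count, pct in rows.values()]
    return min(estimates), max(estimates)


def percent_sum(rows):
    """Sum the percentages of a layer; values above 100 indicate tags
    with hits in more than one taxon-specific database."""
    return sum(pct for _, pct in rows.values())


for layer, rows in TABLE_1.items():
    low, high = implied_total_range(rows)
    print(
        f"{layer}: implied total between {low:,.0f} and {high:,.0f} tags; "
        f"percentages sum to {percent_sum(rows):.1f}%"
    )
```

If the rows of a layer give closely agreeing totals, the percentages are consistent with a single shared denominator per layer. A percentage sum well above 100% is what the note's statement about multiple taxonomic hits would lead one to expect.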