Table 1.
Statistics of structure predictions by taxonomic group. For each group, the table shows the total number of sequences in COG (Total COG sequences); the number and fraction (%) of chosen representatives with <50% identity (Representatives); the number and fraction of these representatives with structure predictions (Representatives predicted); the number of SMOG sequence segments with predictions (SMOG segments predicted); the number and fraction of SMOG segments fully covered by regions of structure prediction (Fully covered); and, among the fully covered segments, the number and fraction covered by a single domain region (Single domain). The "Other Bacteria" category comprises the bacterial groups that are less well represented in the COG database (Deinococcus-Thermus, Thermotogae, Fusobacteria, Aquificae, Cyanobacteria).
Group | Total COG sequences | Representatives (%) | Representatives predicted (%) | SMOG segments predicted | Fully covered (% of predicted) | Single domain (% of fully covered) |
Bacteria | ||||||
Proteobacteria alpha | 23383 | 12997 (56) | 8676 (67) | 16681 | 6671 (40) | 6329 (95) |
Proteobacteria gamma | 33375 | 12733 (38) | 7977 (63) | 15011 | 6234 (42) | 5846 (94) |
Other Proteobacteria | 10979 | 5959 (54) | 4015 (68) | 7613 | 3108 (41) | 2913 (94) |
Firmicutes | 20921 | 13626 (65) | 9314 (68) | 17249 | 7092 (41) | 6641 (94) |
Actinobacteria | 9390 | 4241 (45) | 3059 (72) | 5741 | 2408 (42) | 2257 (94) |
Chlamydiae – Spirochaetes | 2829 | 2039 (72) | 1468 (72) | 3078 | 1119 (36) | 1029 (92) |
Other Bacteria | 13870 | 10670 (77) | 7485 (70) | 14905 | 5995 (40) | 5680 (95) |
Archaea | ||||||
Euryarchaeota | 21118 | 14893 (71) | 9968 (67) | 19625 | 7625 (39) | 7235 (95) |
Crenarchaeota | 1254 | 1131 (90) | 774 (68) | 1428 | 549 (38) | 518 (94) |
Eukaryota | ||||||
Fungi | 7198 | 5880 (71) | 2778 (47) | 6801 | 3801 (56) | 3616 (95) |
Total | 144317 | 84169 (58) | 55514 (66) | 108132 | 44602 (41) | 42064 (94) |
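
Each percentage in the table uses a different denominator, as the column headers and caption indicate. The following minimal sketch recomputes the four percentages for one row from its raw counts. The function names `pct` and `row_percentages` are hypothetical, and rounding to the nearest integer is an assumption about how the published values were derived.

```python
def pct(part: int, whole: int) -> int:
    """Percentage of `part` in `whole`, rounded to the nearest integer (assumed rounding)."""
    return round(100 * part / whole)


def row_percentages(total, reps, reps_predicted, segs_predicted,
                    fully_covered, single_domain):
    """Return the four percentages of a table row, each with its own denominator."""
    return (
        pct(reps, total),                    # Representatives: % of total COG sequences
        pct(reps_predicted, reps),           # Representatives predicted: % of representatives
        pct(fully_covered, segs_predicted),  # Fully covered: % of predicted SMOG segments
        pct(single_domain, fully_covered),   # Single domain: % of fully covered segments
    )


# Counts copied from the "Total" row of Table 1.
print(row_percentages(144317, 84169, 55514, 108132, 44602, 42064))
# prints (58, 66, 41, 94)
```

Running the same check on other rows is a quick way to confirm that a percentage was transcribed against the intended denominator.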