Table 2.
Metrics computed on the 14,585 protein functional clusters detected as particularly linked to environmental gradients (hlePFCs).
Score group | Metric | Annotation source or taxonomic level | Value
---|---|---|---
PFC size | Mean | | 3.6%
PFC size | Minimum | | 2
PFC size | Maximum | | 222
Functional scores, homogeneity | Mean homogeneity score with EggNOG annotations (Number of NA values) | EggNOG annotations | 0.94 (1403)
Functional scores, homogeneity | Mean homogeneity score with KEGG annotations (Number of NA values) | KEGG annotations | 0.995 (5964)
Functional scores, unknowns quantification | PFCs only composed of annotated proteins (% of total PFCs) | EggNOG annotations | 12,172 (83.5%)
Functional scores, unknowns quantification | PFCs with at least one annotated protein (% of total PFCs) | EggNOG annotations | 13,182 (90.4%)
Functional scores, unknowns quantification | PFCs only composed of unknown proteins (% of total PFCs) | EggNOG annotations | 1403 (9.6%)
Functional scores, unknowns quantification | PFCs only composed of annotated proteins (% of total PFCs) | KEGG annotations | 6485 (44.5%)
Functional scores, unknowns quantification | PFCs with at least one annotated protein (% of total PFCs) | KEGG annotations | 8621 (59.1%)
Functional scores, unknowns quantification | PFCs only composed of unknown proteins (% of total PFCs) | KEGG annotations | 5964 (40.9%)
Taxonomy scores, homogeneity | PFCs associated with only 1 Phylum (% of total PFCs) (% of PFCs with at least one Phylum annotation) | Phylum level | 13,987 (95.9%) (96.8%)
Taxonomy scores, homogeneity | PFCs associated with only 1 Class (% of total PFCs) (% of PFCs with at least one Class annotation) | Class level | 12,188 (83.6%) (95.7%)
Taxonomy scores, homogeneity | PFCs associated with only 1 Order (% of total PFCs) (% of PFCs with at least one Order annotation) | Order level | 7600 (52.1%) (94.4%)
Taxonomy scores, homogeneity | PFCs associated with only 1 Family (% of total PFCs) (% of PFCs with at least one Family annotation) | Family level | 4459 (30.6%) (94.7%)
Taxonomy scores, homogeneity | PFCs associated with only 1 Genus (% of total PFCs) (% of PFCs with at least one Genus annotation) | Genus level | 548 (3.8%) (66.6%)
Taxonomy scores, homogeneity | PFCs associated with only 1 MAG (% of total PFCs) | MAG level | 515 (3.5%)
Taxonomy scores, unknowns quantification | Only proteins from annotated MAGs (% of total PFCs) | Phylum level | 14,049 (96.3%)
Taxonomy scores, unknowns quantification | Only proteins from unannotated MAGs (% of total PFCs) | Phylum level | 137 (0.1%)
Taxonomy scores, unknowns quantification | Only proteins from annotated MAGs (% of total PFCs) | Class level | 11,814 (81.0%)
Taxonomy scores, unknowns quantification | Only proteins from unannotated MAGs (% of total PFCs) | Class level | 1843 (12.6%)
Taxonomy scores, unknowns quantification | Only proteins from annotated MAGs (% of total PFCs) | Order level | 6206 (42.6%)
Taxonomy scores, unknowns quantification | Only proteins from unannotated MAGs (% of total PFCs) | Order level | 6536 (44.8%)
Taxonomy scores, unknowns quantification | Only proteins from annotated MAGs (% of total PFCs) | Family level | 2998 (20.6%)
Taxonomy scores, unknowns quantification | Only proteins from unannotated MAGs (% of total PFCs) | Family level | 9876 (67.7%)
Taxonomy scores, unknowns quantification | Only proteins from annotated MAGs (% of total PFCs) | Genus level | 367 (2.5%)
Taxonomy scores, unknowns quantification | Only proteins from unannotated MAGs (% of total PFCs) | Genus level | 13,762 (94.4%)
Functional scores are based on the functional annotation of MAG proteins: a functional homogeneity score of 1 means that all proteins in a PFC share the same annotation, while a score of 0 means that all proteins have different annotations (see “Methods”). By “unknown proteins” we refer both to sequences with no match in the databases (KEGG and/or eggNOG) and to sequences present in the databases but lacking functional and/or taxonomic annotation. Taxonomy scores are based on the taxonomic annotations of MAGs from Delmont et al.21.
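
To make the two endpoints of the homogeneity score concrete, the following is a minimal sketch of one score that satisfies them. It is not the formula from “Methods”, which is not reproduced here. The function names `homogeneity_score` and `mean_score`, the linear interpolation between the endpoints, and the convention that a PFC with a single annotated protein scores 1 are all assumptions made for illustration. The only behaviour taken from the table is that a PFC with no annotated protein yields NA, which matches the NA counts equalling the numbers of PFCs composed only of unknown proteins.

```python
from typing import Optional, Sequence, Tuple


def homogeneity_score(annotations: Sequence[Optional[str]]) -> Optional[float]:
    """Hypothetical homogeneity score for one PFC.

    `annotations` holds one entry per protein; None marks an unknown protein.
    Returns 1.0 if all annotated proteins share one annotation, 0.0 if they
    all differ, and None (NA) if no protein is annotated.
    """
    known = [a for a in annotations if a is not None]
    if not known:
        return None  # NA: PFC composed only of unknown proteins
    if len(known) == 1:
        return 1.0  # assumption: a single annotation is trivially homogeneous
    distinct = len(set(known))
    # Assumed linear scaling between the two documented endpoints.
    return 1.0 - (distinct - 1) / (len(known) - 1)


def mean_score(pfcs: Sequence[Sequence[Optional[str]]]) -> Tuple[Optional[float], int]:
    """Mean score over PFCs with a defined score, plus the number of NA values."""
    scores = [homogeneity_score(p) for p in pfcs]
    valid = [s for s in scores if s is not None]
    n_na = len(scores) - len(valid)
    mean = sum(valid) / len(valid) if valid else None
    return mean, n_na


if __name__ == "__main__":
    toy_pfcs = [
        ["K00001", "K00001", "K00001"],  # fully homogeneous
        ["K00001", "K00002", "K00003"],  # fully heterogeneous
        [None, None],                    # only unknown proteins
    ]
    print(mean_score(toy_pfcs))
```

Under these assumptions, the mean is taken only over PFCs with a defined score, and the NA count is reported alongside it, mirroring how the table reports a mean with the number of NA values in parentheses.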