2021 Jul 16;12:4361. doi: 10.1038/s41467-021-24547-1

Table 2.

Metrics computed on the 14,585 protein functional clusters detected as particularly linked to environmental gradients (hlePFCs).

PFC size
  Mean: 3.6%
  Minimum: 2
  Maximum: 222

Functional scores

  EggNOG annotations
    Homogeneity
      Mean homogeneity score with EggNOG annotations (number of NA values): 0.94 (1403)
    Unknowns quantification
      PFCs only composed of annotated proteins (% of total PFCs): 12,172 (83.5%)
      PFCs with at least one annotated protein (% of total PFCs): 13,182 (90.4%)
      PFCs only composed of unknown proteins (% of total PFCs): 1403 (9.6%)

  KEGG annotations
    Homogeneity
      Mean homogeneity score with KEGG annotations (number of NA values): 0.995 (5964)
    Unknowns quantification
      PFCs only composed of annotated proteins (% of total PFCs): 6485 (44.5%)
      PFCs with at least one annotated protein (% of total PFCs): 8621 (59.1%)
      PFCs only composed of unknown proteins (% of total PFCs): 5964 (40.9%)

Taxonomy scores

  Phylum level
    Homogeneity
      PFCs associated with only 1 Phylum (% of total PFCs) (% of PFCs with at least one Phylum annotation): 13,987 (95.9%) (96.8%)
    Unknowns quantification
      Only proteins from annotated MAGs (% of total PFCs): 14,049 (96.3%)
      Only proteins from unannotated MAGs (% of total PFCs): 137 (0.1%)

  Class level
    Homogeneity
      PFCs associated with only 1 Class (% of total PFCs) (% of PFCs with at least one Class annotation): 12,188 (83.6%) (95.7%)
    Unknowns quantification
      Only proteins from annotated MAGs (% of total PFCs): 11,814 (81.0%)
      Only proteins from unannotated MAGs (% of total PFCs): 1843 (12.6%)

  Order level
    Homogeneity
      PFCs associated with only 1 Order (% of total PFCs) (% of PFCs with at least one Order annotation): 7600 (52.1%) (94.4%)
    Unknowns quantification
      Only proteins from annotated MAGs (% of total PFCs): 6206 (42.6%)
      Only proteins from unannotated MAGs (% of total PFCs): 6536 (44.8%)

  Family level
    Homogeneity
      PFCs associated with only 1 Family (% of total PFCs) (% of PFCs with at least one Family annotation): 4459 (30.6%) (94.7%)
    Unknowns quantification
      Only proteins from annotated MAGs (% of total PFCs): 2998 (20.6%)
      Only proteins from unannotated MAGs (% of total PFCs): 9876 (67.7%)

  Genus level
    Homogeneity
      PFCs associated with only 1 Genus (% of total PFCs) (% of PFCs with at least one Genus annotation): 548 (3.8%) (66.6%)
    Unknowns quantification
      Only proteins from annotated MAGs (% of total PFCs): 367 (2.5%)
      Only proteins from unannotated MAGs (% of total PFCs): 13,762 (94.4%)

  MAG
    Homogeneity
      PFCs associated with only 1 MAG (% of total PFCs): 515 (3.5%)
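The taxonomy homogeneity entries report two percentages for the same count, one relative to all 14,585 hlePFCs and one relative to the PFCs with at least one annotation at that level. The sketch below recomputes both from the counts in the table. It assumes (this is not stated in the table) that the second denominator equals the total minus the PFCs containing only proteins from unannotated MAGs; the function and variable names are illustrative.

```python
TOTAL_PFCS = 14_585  # number of hlePFCs, from the table caption

# level: (PFCs associated with only 1 taxon,
#         PFCs with only proteins from unannotated MAGs), both from the table
COUNTS = {
    "Phylum": (13_987, 137),
    "Class": (12_188, 1_843),
    "Order": (7_600, 6_536),
    "Family": (4_459, 9_876),
    "Genus": (548, 13_762),
}


def taxonomy_percentages(single_taxon, unannotated_only, total=TOTAL_PFCS):
    """Return (% of total PFCs, % of PFCs with at least one annotation)."""
    # Assumption: a PFC has at least one annotation at this level unless
    # all of its proteins come from unannotated MAGs.
    with_annotation = total - unannotated_only
    pct_total = round(100 * single_taxon / total, 1)
    pct_annotated = round(100 * single_taxon / with_annotation, 1)
    return pct_total, pct_annotated


for level, (single, unannotated) in COUNTS.items():
    pct_total, pct_annotated = taxonomy_percentages(single, unannotated)
    print(f"{level}: {pct_total}% of total, {pct_annotated}% of annotated")
```

Under this assumption the recomputed pairs match the two percentages reported for each of the five taxonomic levels.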

Functional scores are based on the functional annotation of MAG proteins: a functional homogeneity score of 1 means that all proteins in a PFC share the same annotation, while a score of 0 means that all proteins have different annotations (see “Methods”). “Unknown proteins” refers both to sequences with no match in the databases (KEGG and/or eggNOG) and to sequences present in the databases but lacking functional and/or taxonomic annotation. Taxonomy scores are based on the taxonomic annotations of MAGs from Delmont et al. (ref. 21).
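The exact homogeneity formula is given in the paper's Methods and is not reproduced here. As a minimal sketch, the function below is one simple definition that satisfies the two endpoints described above (1 when all annotations are identical, 0 when all differ). The formula, the handling of unannotated proteins (ignored, with `None` returned when no protein is annotated), and the name `homogeneity_score` are all assumptions made for illustration, not the authors' implementation.

```python
def homogeneity_score(annotations):
    """Illustrative homogeneity score for one PFC.

    `annotations` holds one label per protein; None marks an unknown protein.
    Assumed formula: 1 - (distinct - 1) / (n - 1) over annotated proteins.
    """
    labels = [a for a in annotations if a is not None]
    if not labels:
        return None  # assumed NA case: the PFC contains only unknown proteins
    if len(labels) == 1:
        return 1.0  # assumption: a single annotated protein is trivially homogeneous
    distinct = len(set(labels))
    return 1 - (distinct - 1) / (len(labels) - 1)


print(homogeneity_score(["K00001", "K00001", "K00001"]))  # all identical
print(homogeneity_score(["K00001", "K00002", "K00003"]))  # all different
print(homogeneity_score([None, None]))                    # only unknown proteins
```

Treating fully unannotated PFCs as NA is consistent with the table, where the number of NA values for each database equals the number of PFCs composed only of unknown proteins (1403 for EggNOG, 5964 for KEGG).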