2022 Dec 6;13:7516. doi: 10.1038/s41467-022-34388-1

Fig. 1. Phylogeny of four newly proposed and one understudied phyla and an overview of their metabolic potential and global distribution.

a A maximum likelihood phylogenetic tree of 345 genomes, including the 55 metagenome-assembled genomes (MAGs) described in this study. The phylogeny is based on 37 concatenated ribosomal protein-encoding genes identified using PhyloSift. The five lineages are marked with different background colors, and symbols indicate the environmental source of each genome. The metabolic potential of the newly reconstructed genomes is shown in the outer heatmap for nitrogen (N), iron (Fe), oxygen (O), carbon (C), and sulfur (S), as determined using Metagenomic Entropy Based Scores (MEBS). These entropy-based scores indicate the likelihood that a given genome is involved in the main biogeochemical cycles. Genomes that are metabolically related, based on the presence/absence of protein families, are shown as Pfam clusters (see Methods). Bootstrap support values ≥75 are shown as purple circles. b The global distribution of the five phyla described in this study, shown on a map generated using the ‘ggmap’ and ‘maptools’ packages in R. The phyla are highlighted in five distinct colors. Habitats where these phyla were identified (based on 16S rRNA sequence homology using BLAST and further confirmed by phylogeny; thresholds are listed in Supplementary Data 6) are shown with 15 different shapes. Source data are provided as a Source Data file.