Author manuscript; available in PMC: 2024 Apr 1.
Published in final edited form as: Nat Biotechnol. 2023 Feb 13;41(10):1416–1423. doi: 10.1038/s41587-023-01675-1

Extended Data Figure 3: Network of putative non-redundant MGCs predicted by gutSMASH.

The unknown predicted MGCs were filtered for redundancy at 0.9 sequence similarity using MMseqs2. Two representatives were picked from each resulting cluster, and all representatives were used as input for BiG-SCAPE with the default cut-offs. The network contains 2,921 nodes and 7,474 edges. The MGCs are classified into four categories based on the key enzyme classes they encode. The GR (glycyl-radical) category comprises MGCs that include the pyruvate formate-lyase (PFL-like) and/or glycyl radical (Gly_radical) domains. The OD (oxidative decarboxylation) category comprises MGCs with at least one of the following Pfam domains: Pyruvate ferredoxin/flavodoxin oxidoreductase (POR); Pyruvate flavodoxin/ferredoxin oxidoreductase, thiamine diP-bdg (POR_N); Pyruvate:ferredoxin oxidoreductase core domain II (PFOR_II); and Thiamine pyrophosphate enzyme, C-terminal TPP binding domain (TPP_enzyme_C). The Flavoenzymes category comprises MGCs harbouring at least one of the custom-made BaiCD and BaiH pHMMs. HGD-D-related MGCs, as the name indicates, include enzymes matching any of the 2-hydroxyglutaryl-CoA dehydratase, D-component (HGD-D)-related pHMM domains.
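The category rules in this caption can be read as a lookup from domain hits to labels. The following is a minimal sketch of that reading, not the authors' implementation. The function name `classify_mgc` and the table `CATEGORY_DOMAINS` are hypothetical. The domain labels are copied from the caption, and the single label `HGD-D` is a placeholder for the HGD-D-related pHMMs, which the caption does not list individually. The caption also does not say how an MGC matching several categories is handled, so the sketch simply returns every matching category.

```python
from typing import Dict, FrozenSet, Iterable, List

# Domain labels as written in the caption; the identifiers actually used by
# gutSMASH may differ. "HGD-D" is a placeholder (assumption) for the set of
# HGD-D-related pHMMs, which the caption does not enumerate.
CATEGORY_DOMAINS: Dict[str, FrozenSet[str]] = {
    "GR": frozenset({"PFL-like", "Gly_radical"}),
    "OD": frozenset({"POR", "POR_N", "PFOR_II", "TPP_enzyme_C"}),
    "Flavoenzymes": frozenset({"BaiCD", "BaiH"}),
    "HGD-D-related": frozenset({"HGD-D"}),
}


def classify_mgc(domains: Iterable[str]) -> List[str]:
    """Return every category whose key domains overlap the MGC's domain hits.

    An MGC qualifies for a category when it carries at least one of that
    category's key domains. Categories are returned in table order; an empty
    list means no key domain was found.
    """
    found = set(domains)
    return [
        category
        for category, key_domains in CATEGORY_DOMAINS.items()
        if found & key_domains
    ]
```

For example, under these assumptions an MGC whose domain hits include `POR_N` would be labelled OD, and one carrying both `PFL-like` and `BaiH` would receive both the GR and Flavoenzymes labels.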