Author manuscript; available in PMC: 2022 Nov 17.
Published in final edited form as: Nat Prod Rep. 2021 Nov 17;38(11):2041–2065. doi: 10.1039/d1np00036e

Figure 5.

Examples of pattern-based, weighted pattern-based, and feature-based methods for integrating metabolomics and genomics datasets. (A) Pattern-based strategies use presence-absence matrices of gene cluster content and metabolite detection across strains to identify strongly overlapping gene cluster-metabolite pairs for targeted study. (B) Weighted pattern-based strategies, in addition to examining presence-absence patterns, develop specific metrics to score metabolite-gene cluster pairs. For example, fungal artificial chromosomes (FACs) can be used to heterologously express metabolites from as-yet-uncharacterized gene clusters. To distinguish heterologously expressed metabolites from the thousands of host-encoded metabolites, a FAC-score was developed to quickly rank the metabolites most likely to be encoded by the FAC-encoded gene cluster. (C) Feature-based methods use BGC sequence data to infer structural features of the encoded metabolites, enabling the generation of predicted spectral profiles and their comparison to experimental data for targeted compound discovery.
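
The pattern-based idea in panel (A) can be illustrated with a minimal sketch. The caption does not name a specific overlap metric, so the Jaccard index used here is an assumption chosen for illustration, and the strain names, gene cluster family labels, and m/z labels are hypothetical toy data rather than values from the paper. The helper names `strains_with` and `rank_pairs` are likewise invented for this example.

```python
from itertools import product


def strains_with(feature, table):
    """Return the set of strains whose entry in `table` contains `feature`."""
    return {strain for strain, features in table.items() if feature in features}


def rank_pairs(gene_clusters, metabolites):
    """Score every gene cluster-metabolite pair by the overlap of their strain sets.

    Both arguments map a strain name to the set of features present in it,
    i.e. a sparse presence-absence matrix. The score is the Jaccard index
    (an assumed metric, not one taken from the source).
    """
    all_clusters = sorted(set().union(*gene_clusters.values()))
    all_metabolites = sorted(set().union(*metabolites.values()))

    scored = []
    for cluster, metabolite in product(all_clusters, all_metabolites):
        cluster_strains = strains_with(cluster, gene_clusters)
        metabolite_strains = strains_with(metabolite, metabolites)
        union = cluster_strains | metabolite_strains
        shared = cluster_strains & metabolite_strains
        score = len(shared) / len(union) if union else 0.0
        scored.append((cluster, metabolite, score))

    # Highest-overlap pairs first; ties keep their original (alphabetical) order.
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical toy data: which gene cluster families and metabolites
# were observed in each strain.
gene_clusters = {
    "strain_1": {"GCF_A", "GCF_B"},
    "strain_2": {"GCF_A"},
    "strain_3": {"GCF_B"},
    "strain_4": set(),
}
metabolites = {
    "strain_1": {"m/z_301", "m/z_455"},
    "strain_2": {"m/z_301"},
    "strain_3": {"m/z_455"},
    "strain_4": set(),
}

ranked = rank_pairs(gene_clusters, metabolites)
for cluster, metabolite, score in ranked:
    print(f"{cluster}\t{metabolite}\t{score:.2f}")
```

In this toy case the pairs whose strain distributions coincide exactly rank first, which is the kind of strongly overlapping pair the caption describes as a candidate for targeted study. A weighted approach such as the one in panel (B) would replace the plain overlap score with a purpose-built metric; the FAC-score itself is not defined in the caption and is not reproduced here.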