Abstract
Functional profiling of metagenomes and metatranscriptomes is essential for understanding microbial community capabilities, yet current methods require computationally expensive translated-search alignments that scale poorly to the large genome-resolved reference databases now common in the field. We introduce Leviathan, an open-source software package for integrated taxonomic and functional profiling that operates at both genome and pangenome resolution. Leviathan combines Sylph for ultrafast alignment-free taxonomic profiling with Salmon for pseudoalignment-based read quantification in DNA space against genome-resolved gene catalogs, bypassing the translated-search step that dominates runtime in existing approaches. For each (pan)genome, Leviathan's functional profiling produces two metrics: pathway abundance, from aggregated gene-level quantification, and pathway coverage, from graph-based assessment of enzymatic step completeness. On the CAMI-I and CAMI-II datasets, Leviathan achieved up to 74-fold faster runtimes and 14-fold lower memory usage than HUMAnN, while improving genome-level assignment accuracy by up to 12% and pangenome-level accuracy by up to 5%. We demonstrate Leviathan's applicability through two case studies: a marine plastisphere metagenomics dataset, where differential coverage analysis revealed metabolic shifts between early and mature biofilm communities, and a dental caries metatranscriptomics dataset, where pangenome-resolved co-expression network analysis identified organism-specific transcriptional patterns diagnostic of health and disease states. Leviathan is available at https://github.com/jolespin/leviathan.
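To illustrate the distinction between pathway abundance and pathway coverage, the following is a minimal sketch and not Leviathan's actual implementation. It assumes hypothetical inputs (gene-level abundances for one genome, a gene-to-enzyme mapping, and a pathway defined as a flat list of enzymatic steps) and approximates coverage as the fraction of steps with at least one detected gene, which is a simplification of the graph-based assessment described in the abstract. All names and values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical example data for a single genome (not from the paper).
gene_abundance = {"geneA": 12.0, "geneB": 0.0, "geneC": 3.5, "geneD": 7.0}
gene_to_enzyme = {"geneA": "E1", "geneB": "E2", "geneC": "E3", "geneD": "E3"}
pathway_steps = {"pathway_X": ["E1", "E2", "E3"]}


def profile_pathways(gene_abundance, gene_to_enzyme, pathway_steps):
    """Return per-pathway abundance and coverage for one genome.

    Assumption: abundance is the sum of gene abundances over the pathway's
    enzymes, and coverage is the fraction of steps with nonzero abundance.
    """
    # Aggregate gene-level abundances up to the enzyme level.
    enzyme_abundance = defaultdict(float)
    for gene, abundance in gene_abundance.items():
        enzyme = gene_to_enzyme.get(gene)
        if enzyme is not None:
            enzyme_abundance[enzyme] += abundance

    results = {}
    for pathway, steps in pathway_steps.items():
        # Abundance: total signal across all steps of the pathway.
        abundance = sum(enzyme_abundance.get(step, 0.0) for step in steps)
        # Coverage: share of steps that have any detected gene.
        detected = sum(1 for step in steps if enzyme_abundance.get(step, 0.0) > 0)
        coverage = detected / len(steps) if steps else 0.0
        results[pathway] = {"abundance": abundance, "coverage": coverage}
    return results


profiles = profile_pathways(gene_abundance, gene_to_enzyme, pathway_steps)
print(profiles)
```

In this toy case the pathway has substantial abundance but is only two-thirds covered, because one step has no detected gene. That is the kind of case where reporting both metrics is more informative than reporting either alone.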
