2023 Mar 15;30(17):48929–48947. doi: 10.1007/s11356-023-26220-0

Table 2.

Various computational pipelines used for microbial genomics, proteomics, and functional diversity

S. no	Pipeline name	Usage	Description	URL	References
1	MicrobioLink	A computational pipeline to analyze microbiome–host interactions at a cellular level using network and systems biology approaches	MicrobioLink analyzes how microbial proteins in a given context influence cellular processes by modulating gene or protein expression. It can evaluate an entire microbial community or a single microorganism, either a commensal or a pathogen, that can interfere with host processes via protein-mediated signal transduction		Andrighetti et al. (2020) https://doi.org/10.3390/cells9051278
2	BIOCOM-PIPE	A flexible and independent suite of tools for processing data from high-throughput sequencing technologies	BIOCOM-PIPE is a new pipeline designed to characterize microbial diversity from environmental DNA metabarcoding data. It focuses on the diversity of archaeal, bacterial, fungal, and photosynthetic microeukaryote amplicons	https://doi.org/10.5281/zenodo.3678129	Djemiel et al. (2020) https://doi.org/10.1186/s12859-020-03829-3
3	Bactopia	Provides efficient comparative genomic analyses for bacterial species or genera	Bactopia is built on the Nextflow workflow software to make efficient use of large clusters and cloud-computing environments when processing the many thousands of genomes currently being generated. For users who are unfamiliar with bacterial genomic tools and/or who require a standardized pipeline, Bactopia is a one-stop shop that can be easily deployed using conda, Docker, and Singularity containers. For researchers with a particular interest in individual species or genera, Bactopia Datasets (BaDs) can be highly customized with taxon-specific databases. Running multiple tasks on a single platform standardizes the underlying data quality used for gene and variant calling between projects run in different laboratories	https://www.github.com/bactopia/bactopia	Petit and Read (2020) https://doi.org/10.1128/mSystems.00190-20
4	Bacteria Genome Pipeline (BAGEP)	An automated and scalable pipeline built on the Snakemake framework	BAGEP is a pipeline for monomorphic bacteria that performs quality control on paired-end FASTQ files, scans reads for contaminants using a taxonomic classifier, maps reads to a reference genome of choice for variant detection, detects antimicrobial resistance (AMR) genes, constructs a phylogenetic tree from core genome alignments, and provides interactive single nucleotide polymorphism (SNP) visualization across core genomes in the data set. The authors aimed to create an easy-to-use pipeline from existing bioinformatics tools that can be deployed on a personal computer. It is intended for researchers in low-to-middle income countries and for people with little or no bioinformatics experience who need to analyze raw genomics data		Olawoye et al. (2020) https://doi.org/10.7717/peerj.10121
5	MetaPhage	Automated pipeline for analyzing, annotating, and classifying bacteriophages in metagenomics sequencing data	MetaPhage (MP) is a fully automated computational pipeline for quality control, assembly, and phage detection, as well as classification and quantification of these phages in metagenomics data. It is intended to assist nonspecialists in the decision-making process and to facilitate workflow management. The pipeline is modular, so the user can skip some of the steps and resume the analysis after execution errors. To guarantee scalability and reproducibility, the pipeline is implemented in Nextflow	https://github.com/MattiaPandolfoVR/MetaPhage	Pandolfo et al. (2022) https://doi.org/10.1128/msystems.00741-22
6	VirusSeeker	A pipeline for novel virus discovery and virome composition analysis. The VS-Virome pipeline is controlled by a master Perl script, VirusSeeker-Virome	This pipeline quickly identifies candidate viral sequences by alignment to virus-only databases and then removes false positives by alignment. It detects multiple, diverse groups of RNA and DNA viruses		Zhao et al. (2017) https://doi.org/10.1016/j.virol.2017.01.005
7	MetaFlow|mics	Reproducible Nextflow pipeline for the analysis of microbiome marker data	A comprehensive pipeline for the analysis of microbiome marker data. It produces a detailed account of the number of reads assigned to each sample and breaks the results down further by indicating whether the index matches the barcode perfectly, as well as the number of indexes containing errors. It also visualizes the overall read quality distribution and the log-transformed distribution of the number of sequences across all samples. The pipeline is seamlessly scalable, interoperable, and extensible	https://github.com/hawaiidatascience/metaflowmics	Arisdakessian et al. (2020) https://doi.org/10.1145/3311790.3396664
8	ASA³P	An automatic pipeline for assembly, annotation, and higher-level analysis of closely related bacterial isolates	This pipeline conducts comprehensive genome characterizations and analyses, such as detection of antibiotic resistance genes, identification of virulence factors, and taxonomic classification	https://github.com/oschwengers/asap	Schwengers et al. (2020) https://doi.org/10.1371/journal.pcbi.1007134
9	SURPI	Sequence-based ultrarapid pathogen identification	This pipeline helps to identify pathogens from complex NGS data generated from clinical samples. In fast mode, it provides extensive classification of reads against viral and bacterial databases. The SURPI pipeline consists of a set of fixed external software and database dependencies together with user-defined custom parameters	http://chiulab.ucsf.edu/surpi	Naccache et al. (2014) https://doi.org/10.1101/gr.171934.113
10	DiagnoTop	A computational pipeline for discriminating bacterial pathogens without database search	This pipeline differentiates the spectral clusters found in top-down proteomics data sets used for microbial diagnostics, without requiring a database search. It is a promising tool for clinical microbiology and biomarker discovery	http://patternlabforproteomics.org/diagnotop/	Lima et al. (2021) https://doi.org/10.1021/jasms.1c00014
11	ViroMatch	Computational pipeline for the detection of viral sequences from complex metagenomic data	An automated pipeline in which metagenomic sequences are screened for putative viral reads by nucleotide mapping and translated mapping	https://github.com/twylie/viromatch	Wylie and Wylie (2021) https://doi.org/10.1128/MRA.01468-20
12	V-pipe	Assessment of viral genetic diversity from high-throughput sequencing data	This computational pipeline combines statistical models and computational tools for end-to-end analysis of raw sequencing reads	https://github.com/cbg-ethz/V-pipe	Posada-Céspedes et al. (2021) https://doi.org/10.1093/bioinformatics/btab015
13	IDseq	Cloud-based pipeline for metagenomic pathogen detection and monitoring	This pipeline accepts raw metagenomic next-generation sequencing (mNGS) data and applies host and quality filtration steps to produce reads and contigs for taxonomic categorization. It is specifically designed for the detection of novel pathogens	https://idseq.net	Kalantar et al. (2020) https://doi.org/10.1093/gigascience/giaa111
14	HAYSTAC	Species identification pipeline based on a novel Bayesian framework	High AccuracY and Scalable Taxonomic Assignment of MetagenomiC data (HAYSTAC) was developed for robust and rapid species identification from high-throughput sequencing data. It handles both ancient and modern DNA data, as well as incomplete reference databases	https://github.com/antonisdim/HAYSTAC	Dimopoulos et al. (2022) https://doi.org/10.1371/journal.pcbi.1010493
15	SeqScreen	Accurate and sensitive functional screening of pathogenic sequences	This pipeline characterizes short nucleotide sequences using taxonomic and functional labels and a customized set of curated Functions of Sequences of Concern (FunSoCs) specific to microbial pathogenesis. It combines machine learning classifiers, alignment-based tools, curated databases, and curation-based labelling of protein sequences with custom functions for accurate pathogen identification	www.gitlab.com/treangenlab/seqscreen	Balaji et al. (2022) https://doi.org/10.1186/s13059-022-02695-x
16	VAPiD	A portable and lightweight command-line tool for annotation and GenBank deposition of viral genomes	VAPiD was developed to facilitate viral genome annotation. It handles batch submission of multiple viruses without prior knowledge of the viral species, correctly annotates RNA editing and ribosomal slippage, handles submission of metadata, and runs with a simple one-line command	https://github.com/rcs333/VAPiD	Shean et al. (2019) https://doi.org/10.1186/s12859-019-2606-y