2023 Mar 15;30(17):48929–48947. doi: 10.1007/s11356-023-26220-0

Table 2.

Various computational pipelines used for microbial genomics, proteomics, and functional diversity

S. no Pipeline name Usage Description URL References
1 MicrobioLink A computational pipeline to analyze microbiome–host interactions at a cellular level using network and systems biology approaches

MicrobioLink analyzes how microbial proteins, in a given context, influence cellular processes by modulating gene or protein expression

MicrobioLink can evaluate an entire microbial community or a single microorganism, whether commensal or pathogen, that interferes with host processes via protein-mediated signal transduction

Andrighetti et al. (2020). https://doi.org/10.3390/cells9051278
2 BIOCOM-PIPE A flexible and independent suite of tools for processing data from high-throughput sequencing technologies

BIOCOM-PIPE is focused on the diversity of archaeal, bacterial, fungal, and photosynthetic microeukaryote amplicons

It is a new pipeline designed to characterize microbial diversity from environmental DNA metabarcoding data

https://doi.org/10.5281/zenodo.3678129 Djemiel et al. (2020). https://doi.org/10.1186/s12859-020-03829-3
3 Bactopia Provides efficient comparative genomic analyses for bacterial species or genera

Bactopia is built on the Nextflow workflow software to make efficient use of large clusters and cloud-computing environments when processing the many thousands of genomes that are currently being generated. For users who are not familiar with bacterial genomic tools and/or who require a standardized pipeline, Bactopia is a one-stop shop that can be easily deployed using conda, Docker, and Singularity containers. For researchers with a particular interest in individual species or genera, Bactopia Datasets (BaDs) can be highly customized with taxon-specific databases

Running multiple tasks on a single platform standardizes the underlying data quality used for gene and variant calling between projects run in different laboratories

https://www.github.com/bactopia/bactopia

Petit and Read (2020)

https://doi.org/10.1128/mSystems.00190-20

4 Bacteria Genome Pipeline (BAGEP) An automated and scalable pipeline built on the Snakemake framework

This pipeline is intended for researchers in low- to middle-income countries and for people with little or no bioinformatics experience in analyzing raw genomics data

BAGEP is a pipeline for monomorphic bacteria that performs quality control on paired-end FASTQ files, scans reads for contaminants using a taxonomic classifier, maps reads to a reference genome of choice for variant detection, detects antimicrobial resistance (AMR) genes, constructs a phylogenetic tree from core genome alignments, and provides interactive single nucleotide polymorphism (SNP) visualization across core genomes in the data set. The authors' objective was to create an easy-to-use pipeline from existing bioinformatics tools that can be deployed on a personal computer

Olawoye et al. (2020) https://doi.org/10.7717/peerj.10121
5 MetaPhage

Automated Pipeline for Analyzing, Annotating, and Classifying Bacteriophages in Metagenomics Sequencing Data

The pipeline is implemented in Nextflow

To assist the nonspecialist in the decision-making process and facilitate workflow management, MetaPhage (MP) provides a fully automated computational pipeline for quality control, assembly, and phage detection, as well as classification and quantification of these phages in metagenomics data. The pipeline is modular and enables the user to skip some of the steps and resume the analysis in the event of execution errors. To guarantee scalability and reproducibility, MP is implemented in Nextflow

https://github.com/MattiaPandolfoVR/MetaPhage

Pandolfo et al. (2022). https://doi.org/10.1128/msystems.00741-22
6 VirusSeeker A pipeline for novel virus discovery and virome composition analysis. The VS-Virome pipeline is controlled by a master Perl script, VirusSeeker-Virome. This pipeline quickly identifies candidate viral sequences by alignment to virus-only databases and then removes false positives by alignment. It detects multiple, diverse groups of RNA and DNA viruses

Zhao et al. (2017)

https://doi.org/10.1016/j.virol.2017.01.005

7 MetaFlow/mics Reproducible Nextflow pipeline for the analysis of microbiome marker data

It is a comprehensive pipeline for the analysis of microbiome marker data

The pipeline produces a detailed account of the number of reads assigned to each sample and further breaks down the results by indicating whether the index matches the barcode perfectly, as well as the number of indexes containing errors. The pipeline provides a visualization of the overall read quality distribution as well as the log-transformed distribution of the number of sequences in the total samples

Seamlessly scalable, interoperable, and extensible

https://github.com/hawaiidatascience/metaflowmics

Arisdakessian et al. (2020)

https://doi.org/10.1145/3311790.3396664

8 ASA3P An automatic pipeline for assembly, annotation, and higher-level analysis of closely related bacterial isolates. This pipeline conducts comprehensive genome characterizations and analyses, such as detection of antibiotic resistance genes, identification of virulence factors, and taxonomic classification https://github.com/oschwengers/asap

Schwengers et al. (2020)

https://doi.org/10.1371/journal.pcbi.1007134

9 SURPI Sequence-based ultrarapid pathogen identification. This pipeline helps to identify pathogens from complex NGS data generated from clinical samples. In fast mode, it provides extensive classification of reads against viral and bacterial databases. The SURPI pipeline consists of a set of fixed external software and database dependencies and user-defined custom parameters http://chiulab.ucsf.edu/surpi

Naccache et al. (2014)

https://doi.org/10.1101/gr.171934.113

10 DiagnoTop A computational pipeline for discriminating bacterial pathogens without database search. This pipeline differentiates the spectral clusters found in top-down proteomics data sets, which are used for microbial diagnostics, without a database search. It is a promising tool for clinical microbiology and biomarker discovery http://patternlabforproteomics.org/diagnotop/

Lima et al. (2021)

https://doi.org/10.1021/jasms.1c00014

11 ViroMatch Computational pipeline for the detection of viral sequences from complex metagenomic data. It is an automated pipeline in which metagenomic sequences are screened for putative viral reads by nucleotide mapping and translated mapping https://github.com/twylie/viromatch

Wylie and Wylie (2021)

https://doi.org/10.1128/MRA.01468-20

12 V-pipe For assessment of viral genetic diversity from high-throughput sequencing data. This computational pipeline combines statistical models and computational tools for end-to-end analysis of raw sequencing reads https://github.com/cbg-ethz/V-pipe

Posada-Céspedes et al. (2021)

https://doi.org/10.1093/bioinformatics/btab015

13 IDseq Cloud-based pipeline for metagenomic pathogen detection and monitoring. It accepts raw mNGS data, applies host and quality filtering steps, and produces reads and contigs for taxonomic classification. It is specifically designed for the detection of novel pathogens https://idseq.net

Kalantar et al. (2020)

https://doi.org/10.1093/gigascience/giaa111

14 HAYSTAC Based on a novel Bayesian framework. The High AccuracY and Scalable Taxonomic Assignment of MetagenomiC data (HAYSTAC) pipeline was developed for robust and rapid species identification from high-throughput sequencing data. It can handle both ancient and modern DNA data, as well as incomplete reference databases https://github.com/antonisdim/HAYSTAC

Dimopoulos et al. (2022)

https://doi.org/10.1371/journal.pcbi.1010493

15 SeqScreen For accurate and sensitive functional screening of pathogenic sequences. This pipeline characterizes short nucleotide sequences using taxonomic and functional labels and a customized set of curated Functions of Sequences of Concern (FunSoCs) specific to microbial pathogenesis. It combines machine learning classifiers, alignment-based tools, curated databases, and curation-based labelling of protein sequences, along with custom functions, for accurate identification of pathogens www.gitlab.com/treangenlab/seqscreen

Balaji et al. (2022)

https://doi.org/10.1186/s13059-022-02695-x

16 VAPiD A portable and lightweight command-line tool for annotation and GenBank deposition of viral genomes. The VAPiD pipeline was developed to facilitate viral genome annotation. It can handle batch submission of multiple viruses without prior knowledge of the viral species, correctly annotates RNA editing and ribosomal slippage, runs with a simple one-line command, and handles submission of metadata https://github.com/rcs333/VAPiD

Shean et al. (2019)

https://doi.org/10.1186/s12859-019-2606-y