Table 3.
Analysis | Commonly Used Tools |
---|---|
Common Analysis | |
Quality check of sequences | FastQC [90], FASTX-toolkit [91], MultiQC [92] |
Trimming of adaptors and low-quality bases | Trimmomatic [93], Cutadapt [94], fastp [95] |
Alignment of sequence reads to reference genome | BWA [96], Bowtie [97], dragMAP [98] |
Reports visualization | MultiQC [92] |
Whole-Genome Sequencing/Whole-Exome Sequencing/Targeted Panel | |
Removal of duplicate reads | Picard [99], Sambamba [100] |
Variant calling (single-nucleotide polymorphisms and indels) | GATK [101], freeBayes [102], Platypus [103], VarScan [104], DeepVariant [105], Illumina Dragen [106] |
Filter and merge variants | bcftools [107] |
Variant annotation | ANNOVAR [108], ensemblVEP [109], snpEff [110], NIRVANA [111] |
Structural variant calling | DELLY [112], Lumpy [113], Manta [114], GRIDSS [115], Wham [116], Pindel [117] |
Copy number variation (CNV) calling | CNVnator [118], GATK gCNV [119], cn.MOPS [120], cnvCapSeq (targeted sequencing) [121], ExomeDepth (CNVs from exome) [122] |
Transcriptomics | |
Alignment of reads to reference | Splice-aware aligners such as TopHat2 [123], HISAT2 [124], and STAR [125] |
Transcript quantification | featureCounts [126], HTSeq-count [127], Salmon [128], Kallisto [129] |
Differential gene expression analysis and enrichment of gene categories | DESeq2 [130], edgeR [131], DAVID [132], clusterProfiler [133], Enrichr [134] |
Epigenomics-Methyl Seq | |
Sequence aligners | Bwameth [135], BS-Seeker2 [136], Bismark [137] |
Methylation level quantification | MethylDackel * |
Differential methylation | Metilene [138], BSsmooth [139], methylKit [140] |
Epigenomics-ChIP seq | |
Removal of PCR duplicates | Samtools [107] |
Peak calling | MACS2 [141], SICER2 [142], SPP [143] |
Peak filtering | Bedtools [144] |
Enrichment quality control | ChIPQC [145], Phantompeakqualtools [146] |
Enrichment comparison | DiffBind [147], MAnorm [148], MMDiff [149] |
Motif analysis | MEME-ChIP [150], HOMER [151], RSAT [152] |
16S rRNA seq | |
16S rRNAseq analysis pipelines | QIIME2 [82], mothur [153], USEARCH [154] |
Ribosomal RNA databases | Greengenes [155], Silva [156], RDP [157] |
Shotgun Metagenomics | |
Taxonomic classification | MetaPhlAn4 [158], Kaiju [159], Kraken [160] |
Assembly of metagenomic reads | metaSPAdes [86], metaIDBA [87] |
Protein databases for taxonomic classification | NCBI non-redundant protein database [83] |
Gene annotation | Prokka [88], MetaGeneMark [89] |
Databases for functional annotation of genes | COG [161], KEGG [84], GO [85] |
Footnote: ANNOVAR—ANNOtate VARiation; BWA—Burrows Wheeler Aligner; cn.MOPS—Copy Number Estimation by a Mixture Of PoissonS; COG—Clusters of Orthologous Groups of Proteins; DAVID—A Database for Annotation, Visualization and Integrated Discovery; Ensembl VEP—Ensembl Variant Effect Predictor; fastp—Fastq Preprocessor; GATK—Genome Analysis Tool Kit; GO—Gene Ontology; HISAT2—Hierarchical Indexing for Spliced Alignment of Transcripts; HOMER—Hypergeometric Optimization of Motif EnRichment; HTSeq-count—High-Throughput Sequence Analysis in Python; KEGG—Kyoto Encyclopedia of Genes and Genomes; NCBI—National Center for Biotechnology Information; MACS—Model-Based Analysis for ChIP-Seq; MEME—Multiple EM for Motif Elicitation; Meta-IDBA—Meta-Iterative De Bruijn Graph De Novo Short-Read Assembler; MetaPhlAn—Metagenomic Phylogenetic Analysis; metaSPAdes—meta St Petersburg Genome Assembler; QIIME—Quantitative Insights Into Microbial Ecology; RDP—Ribosomal Database Project; RSAT—Regulatory Sequence Analysis Tools; SICER—Spatial Clustering Approach for the Identification of ChIP-Enriched regions; SPP—The Signaling Pathways Project; STAR—Spliced Transcripts Alignment to a Reference. * Available at: https://github.com/dpryan79/MethylDackel/ (accessed on 1 June 2023). Bold represents the categories of analysis and commonly used bioinformatics tools used for NGS data analysis.
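
To illustrate how the "Common Analysis" rows of Table 3 might chain together for a single paired-end sample, the following minimal Python sketch assembles command lines for one tool from each of the first three rows (FastQC, fastp, and BWA). It is an assumption-laden example rather than a recommended pipeline: the function name `common_analysis_commands`, the file names, and the thread count are hypothetical, and the commands are only built, not executed.

```python
from pathlib import Path


def common_analysis_commands(r1, r2, reference, outdir, threads=4):
    """Return argv lists for QC, trimming, and alignment of one paired-end sample.

    Nothing is executed here; each list could be passed to subprocess.run().
    All file names are placeholders supplied by the caller.
    """
    out = Path(outdir)
    trimmed_r1 = str(out / "trimmed_R1.fastq.gz")
    trimmed_r2 = str(out / "trimmed_R2.fastq.gz")
    return [
        # Quality check of the raw reads (FastQC writes its reports to outdir).
        ["fastqc", r1, r2, "-o", str(out)],
        # Adapter and low-quality base trimming (fastp, paired-end mode).
        ["fastp", "-i", r1, "-I", r2, "-o", trimmed_r1, "-O", trimmed_r2],
        # Alignment of the trimmed reads; bwa mem writes SAM to standard output.
        ["bwa", "mem", "-t", str(threads), reference, trimmed_r1, trimmed_r2],
    ]


if __name__ == "__main__":
    # Hypothetical inputs, used only to print the assembled commands.
    for argv in common_analysis_commands(
        "sample_R1.fastq.gz", "sample_R2.fastq.gz", "ref.fa", "results"
    ):
        print(" ".join(argv))
```

In practice the reference would first need to be indexed (for BWA, with `bwa index`), and the SAM output of the alignment step would be redirected or piped into the duplicate-removal and variant-calling tools listed in the later rows.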