Table 2. An overview of software tools for analysing long-read sequencing data
| Category | Tool name | Description | Ref. |
|---|---|---|---|
| De novo assembly | (Hi)Canu | Versatile de novo assembler | 23 |
| | Flye | Fast de novo assembler that can also operate on low-coverage data | 24 |
| | Shasta | Fast ONT assembler | 25 |
| | Falcon Unzip | PacBio assembler for phased assemblies | 22 |
| | Peregrine | Optimized assembler for HiFi data only | 128 |
| | hifiasm | Optimized assembler for HiFi data only | 139 |
| | PGAS | Phased assembly including Strand-seq | 46 |
| Genomic alignment | LAST | Versatile method to align contigs or genomes | 57 |
| | MUMmer | Long-standing genomic aligner | 87 |
| | minimap2 | Pairwise alignment method for long reads up to genomes | 58 |
| | Cactus | Progressive genomic alignment method allowing integration of more than two genomes at a time | 90 |
| | SibeliaZ | Fast genome aligner of multiple genomes | 140 |
| Read alignment | minimap2 | Pairwise alignment method for long reads up to genomes | 58 |
| | NGMLR | Convex gap cost implementation | 42 |
| | Winnowmap | Improvements for mapping in repetitive regions | 59 |
| | lra | Efficient convex-cost gap penalty sequence and contig aligner | 60 |
| Graph genome methods | Giraffe | Rapid read-to-graph aligner | 45 |
| | vg | Toolkit to construct and convert graphs with methods to genotype and call variants | 96 |
| | minigraph | A sequence-to-graph mapper and graph constructor based on minimap2 | 97 |
| | GraphAligner | Sequence-to-graph aligner for long reads | 141 |
| | GraphTyper2 | Genotyping variants in a graph genome from short reads | 100 |
| | Paragraph | Genotyping structural variants in a regional graph genome from short reads | 101 |
| | PanGenie | k-mer-based genotyping of short reads in a haplotype-resolved graph | 99 |
| Phasing | WhatsHap | Phasing method for SNVs and smaller indels | 15 |
| | HapCut2 | Phasing method for SNVs | 16 |
| SV calling from alignment | pbsv | Joint calling of SVs across samples | 62 |
| | Sniffles | Automatic parameter estimation | 42 |
| | cuteSV | Highly parallelized SV calling | 63 |
| | SVIM | Uses graph-based clustering of candidates | 61 |
| SV calling from assemblies | dipcall | Deletion and insertion calling from de novo assembly | 89 |
| | SVIM-asm | SV calling from (diploid) de novo assembly | 142 |
| | PAV | Compares phased assemblies with a reference genome | 46 |
| SNV calling | Clair | Uses a convolutional neural network | 69 |
| | DeepVariant | Neural network-based SNV caller | 67 |
| | Longshot | Partitions reads into haplotypes and calls variants in accordance with those haplotypes | 70 |
| | Pepper | Phasing-based SNV calling | 68 |
| SV merging | SURVIVOR | Merging that allows breakpoint inaccuracies | 113 |
| | SVanalyzer | Assembly-based, two samples only | 98 |
| | Truvari | Parameterized stepwise merging including sequence similarity | 9 |
| | Jasmine | Merging SVs based on sequence similarity | 32 |
| SV genotyping | cuteSV | Force-calling of variants from a VCF file | 63 |
| | Sniffles | Uses split reads to identify known SVs over shared breakpoints | 42 |
| | SVJedi | Compares the alignment of reads against the reference genome and alternative contigs representing the SV to determine the best match | 66 |
| | LRcaller | Genotypes variants from long reads | 11 |
| Other | TRiCoLOR | Detects and genotypes repeat lengths separated by phase | 76 |
| | Iris | Local assembly of insertions | 32 |
| | SVCollector | Optimized sample selection | 43 |
| | NanoComp | Comparison of sequencing data | 53 |
HiFi, high fidelity; indel, insertions–deletions; ONT, Oxford Nanopore Technologies; PacBio, Pacific Biosciences; SNV, single-nucleotide variant; SV, structural variant; VCF, variant call format.