2024 May 16;30:1611676. doi: 10.3389/pore.2024.1611676

TABLE 1.

Summary of the most recent and commonly used long-read bioinformatics tools.

Long-read bioinformatics tools
	Data analysis step	Tool name	Background and performance	References
Complex user-friendly interfaces capable of performing the whole analysis process except error correction: PacBio: SMRT Link (Pacific Biosciences); Nanopore: EPI2ME Labs (Oxford Nanopore)	QC metrics	FastQC, MultiQC, LongQC, NanoPack, MinIONQC, NanoR, RNA-SeQC	The listed items are quality control (QC) tools suitable for both long- and short-read sequencing approaches. They run QC checks on raw sequence data (FastQC) or on whole datasets (MultiQC) and give detailed feedback on any problems detected. For RNA-seq data, a dedicated algorithm (RNA-SeQC) was developed	[47–54]
	Base calling	SMRT analysis tools, Dorado, Guppy	Base callers built on neural networks and statistical methods; SMRT reads require dedicated analysis tools, whereas Dorado and Guppy were developed for NS reads	[55–57]
	Variant calling	Clair3, Sniffles	Sniffles performs structural variant calling on noisy long-read data. Clair3 is a deep neural network-based variant caller that is also capable of haplotype-sensitive variant detection and can call variants from sequencing data containing modified bases	[58–60]
	Variant calling	wf-human-variation, wf-somatic-variation	Complex, command-line-compatible workflows for NS variant detection. On demand, tumor and normal data can be analyzed separately or in combination, and detailed analysis reports are produced	[61]
	Modified base calling	Modbamtools, Guppy, Medaka, DeepSignal, DeepMod	Set of tools to manipulate and visualize DNA/RNA base modification and methylation data stored in .bam format. Some of them are suitable for all long-read techniques. The detectable modified bases are 5mC, 5hmC and 6mA	[33, 57–59, 62, 63]
	Genome assembly	Flye, Canu, HiCanu, BLASR, FALCON	Some of these tools are graph construction-based (Flye), while others use a hierarchical genome assembly process with clustering (BLASR) and overlap-based error correction; FALCON also carries out phasing during de novo genome assembly of high-noise single-molecule sequencing data	[64–68]
	Visualization	NanoPack; R packages: maftools, ggplot2; Python packages: matplotlib (pyVolcano)	Packages offering universal and problem-specific solutions for long-read data visualization	[50, 69–72]
	Error correction	Pilon, Racon, DeepConsensus, Medaka	Neural network- and transformer-based methods intended as standalone modules for correcting raw contigs generated by rapid assembly methods, with or without a consensus step. An advantage of transformer-based error correction methods is that they leverage a unique alignment loss to correct sequencing errors	[33, 35, 71]

Additional packages are listed at https://long-read-tools.org and can be found on other bioinformatics-related pages.
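
To make the "QC metrics" step in Table 1 more concrete, the following minimal Python sketch computes a few summary statistics that long-read QC tools typically report: read count, total bases, read length N50 and mean read quality. It is an illustration only and does not reproduce the implementation of any tool listed in the table. The function names (`parse_fastq`, `n50`, `mean_phred`, `summarize`) and the toy reads are hypothetical, and the sketch assumes four-line FASTQ records with Phred+33 quality encoding. Averaging quality via per-base error probabilities, rather than averaging Phred scores directly, is one common convention and is assumed here.

```python
import io
import math


def parse_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:          # end of file
            return
        if not header.strip():  # tolerate blank lines between records
            continue
        seq = handle.readline().strip()
        handle.readline()       # '+' separator line, ignored
        qual = handle.readline().strip()
        yield header.strip()[1:], seq, qual


def n50(lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def mean_phred(qual, offset=33):
    """Mean quality computed from per-base error probabilities (assumed Phred+33)."""
    if not qual:
        return 0.0
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in qual]
    return -10 * math.log10(sum(probs) / len(probs))


def summarize(handle):
    """Collect basic QC statistics for all reads in a FASTQ stream."""
    lengths, quals = [], []
    for _name, seq, qual in parse_fastq(handle):
        lengths.append(len(seq))
        quals.append(mean_phred(qual))
    return {
        "reads": len(lengths),
        "bases": sum(lengths),
        "n50": n50(lengths),
        "mean_read_quality": sum(quals) / len(quals) if quals else 0.0,
    }


# Toy input (hypothetical reads, not real sequencing data).
EXAMPLE_FASTQ = (
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGTAC\n+\nIIIIII\n"
    "@read3\nACGTACGTAC\n+\nIIIIIIIIII\n"
)

stats = summarize(io.StringIO(EXAMPLE_FASTQ))
print(stats)
```

In practice, the dedicated QC tools in Table 1 report many additional metrics and handle compressed or platform-specific input formats, which this sketch does not attempt.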