TABLE 4.
Analysis tool (reference[s]) | Concept | Method | Run time (h) | Topology score (%) | Web address(es) | Input type(s) | Input format(s) | Output format(s) |
---|---|---|---|---|---|---|---|---|
Web based | ||||||||
PubMLST (158) | Web-accessible database where it is possible to run cgMLST and wgMLST analyses | cgMLST/wgMLST | NA | NA | https://pubmlst.org/ | Contigs | FASTA | cgMLST/wgMLST profile |
CSI Phylogeny 1.4 (161) | High-quality SNP method using reference mapping of reads and mapping and SNP calling assessments | Reference-based SNP | ND | ND | https://cge.cbs.dtu.dk/services/CSIPhylogeny/ | Raw sequences, contigs | FASTA, FASTQ | ND |
NDtree 1.2 (161) | Creates k-mers from reads and maps them to a reference; applies a simple model to determine the number of SNPs | Statistical method | 3–3.5^b | ND | https://cge.cbs.dtu.dk/services/NDtree/ | Raw sequences | FASTQ | Newick |
Command line | ||||||||
kSNP3 (154, 155) | Uses k-mer analyses to detect SNPs between strains without using either multiple-sequence alignment or a reference genome | Non-reference-based SNP | 0.5^c | 91.80–95.80^c,e | https://sourceforge.net/projects/ksnp/ | Raw sequences, contigs | FASTA | Newick, MSA |
Roary (169) | Tool for constructing pangenomes from contigs | Pangenome | 4.30^d | 100^d | https://sanger-pathogens.github.io/Roary/ | Contigs | GFF3 | FASTA, TXT, CSV, Rtab |
Pan-Seq^f (175) | Pangenome assembler with additional locus finder for core/accessory gene allele profiles (a Web-based version is also available) | Pangenome | ND | ND | https://github.com/chadlaing/Panseq, https://lfz.corefacility.ca/panseq/ | Contigs | FASTA | TXT, FASTA |
Lyve-SET (179) | High-quality SNP method using reference mapping of reads and mapping and SNP calling assessments | Reference-based SNP | 6.25^c | 85^c | https://github.com/lskatz/lyve-SET | Raw sequences, contigs^g | FASTA, FASTQ | Matrix, FASTA, Newick, VCF |
SPANDx (182) | Complete workflow for creating SNP/indel matrixes as well as locus presence/absence matrixes from raw sequencing reads from a range of NGS technologies | Reference-based SNP | 3.1^c | 100^c | https://sourceforge.net/projects/spandx/ | Raw sequences | FASTA, FASTQ | NEXUS |
^a All quantitative performance measures were taken from previously reported data, as indicated. ND, no data; NA, not applicable; MSA, multiple-sequence alignment; GFF3, General Feature Format 3; VCF, variant call format.
^b Based on 46 VTEC genomes (20).
^c Based on 21 E. coli genomes (167).
^d Wall time for 1,000 S. enterica serovar Typhi genomes (169).
^e Using core.
^f A Web-based version is also available.
^g Contigs are simulated to reads.
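
The non-reference-based concept listed for kSNP3 (detecting SNPs from k-mers, with no multiple-sequence alignment and no reference genome) can be illustrated with a minimal Python sketch. This is not kSNP3's implementation: the function names `kmers` and `central_base_snps`, the choice of k = 5, and the toy sequences are all assumptions made for illustration. The sketch groups odd-length k-mers by their flanking bases and reports loci where strains differ only at the central base; it ignores reverse complements, sequencing errors, and repeated regions, all of which a real tool has to handle.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-mer (length-k substring) of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def central_base_snps(genomes, k=5):
    """Report loci where strains share both flanks but differ at the central base.

    genomes: dict mapping strain name -> sequence string (hypothetical input).
    Returns a dict mapping (left_flank, right_flank) -> {strain: central base}.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that each k-mer has a central base")
    mid = k // 2

    # For each pair of flanks, collect the central bases seen in each strain.
    loci = defaultdict(lambda: defaultdict(set))
    for strain, seq in genomes.items():
        for kmer in kmers(seq.upper(), k):
            flanks = (kmer[:mid], kmer[mid + 1:])
            loci[flanks][strain].add(kmer[mid])

    snps = {}
    for flanks, by_strain in loci.items():
        # Keep only loci found in every strain ...
        if len(by_strain) != len(genomes):
            continue
        # ... with exactly one central base per strain (no ambiguous repeats).
        if any(len(bases) != 1 for bases in by_strain.values()):
            continue
        alleles = {strain: next(iter(bases)) for strain, bases in by_strain.items()}
        # A locus is a candidate SNP if the strains disagree on the central base.
        if len(set(alleles.values())) > 1:
            snps[flanks] = alleles
    return snps


if __name__ == "__main__":
    # Two toy strains that differ at a single position (C versus T).
    toy = {"A": "ACGTACCGTTAG", "B": "ACGTATCGTTAG"}
    print(central_base_snps(toy, k=5))
    # prints {('TA', 'CG'): {'A': 'C', 'B': 'T'}}
```

Keeping only loci present in every strain corresponds loosely to the "core" restriction noted in footnote e; relaxing that filter would also report candidate SNPs found in a subset of strains.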