Table 1. Summary of genetic distances used to infer the relationships among samples.
Class | Distance | Description |
---|---|---|
Site-based | NUCmer | Suffix-array method that efficiently performs pairwise whole-genome alignment |
Site-based | Extended MLST | Employs the Basic Local Alignment Search Tool to perform pairwise comparisons of predicted open reading frames |
k-mer based | Jaccard Distance | 1 − Jaccard index, where the Jaccard index is the size of the intersection divided by the size of the union of the k-mer sets of two samples |
k-mer based | Manhattan Distance | Sum of the absolute differences in the abundance of each k-mer between two samples |
k-mer based | Euclidean Distance | Square root of the sum of the squared differences in the abundance of each k-mer between two samples |
k-mer based | Mash Distance | Employs the MinHash [23] technique to reduce genomes to sketches (i.e., a reduced representation of the information within the sequence data) and estimates a novel evolutionary distance between them |
k-mer based | Mash Jaccard Distance | The Jaccard Distance (as described above) but based on the sketch size (i.e., the number of hashes) |
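
The k-mer based distances summarized in Table 1 can be illustrated with a minimal Python sketch. It is not the implementation used in the study. The function names, the plain (non-canonical) k-mer counting, and the use of SHA-1 as a stand-in hash for the sketches are all assumptions made for illustration. The Mash distance follows the formula given in the Mash publication, D = −(1/k) ln(2j / (1 + j)), where j is the Jaccard index estimated from the sketches.

```python
import hashlib
from collections import Counter
from math import log, sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (no canonicalisation: an assumption)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def jaccard_distance(a: Counter, b: Counter) -> float:
    """1 - |A ∩ B| / |A ∪ B| on the sets of k-mers present; abundance is ignored."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    if not union:
        return 0.0
    return 1.0 - len(keys_a & keys_b) / len(union)


def manhattan_distance(a: Counter, b: Counter) -> int:
    """Sum of absolute differences in abundance over all k-mers in either sample."""
    # A Counter returns 0 for a k-mer it has not seen.
    return sum(abs(a[kmer] - b[kmer]) for kmer in set(a) | set(b))


def euclidean_distance(a: Counter, b: Counter) -> float:
    """Square root of the sum of squared differences in k-mer abundance."""
    return sqrt(sum((a[kmer] - b[kmer]) ** 2 for kmer in set(a) | set(b)))


def sketch(seq: str, k: int, s: int) -> set:
    """Bottom-s MinHash sketch: the s smallest hash values of the k-mers in seq.

    SHA-1 truncated to 64 bits is a hypothetical stand-in hash function.
    """
    hashes = {
        int.from_bytes(hashlib.sha1(seq[i:i + k].encode()).digest()[:8], "big")
        for i in range(len(seq) - k + 1)
    }
    return set(sorted(hashes)[:s])


def sketch_jaccard_index(sa: set, sb: set, s: int) -> float:
    """Estimate the Jaccard index from the s smallest hashes of the merged sketches."""
    merged = sorted(sa | sb)[:s]
    if not merged:
        return 0.0  # undefined for empty sketches; 0.0 is an arbitrary choice here
    shared = sum(1 for h in merged if h in sa and h in sb)
    return shared / len(merged)


def mash_distance(j: float, k: int) -> float:
    """D = -(1/k) * ln(2j / (1 + j)), written as ln((1 + j) / (2j)) / k."""
    if j == 0:
        return 1.0  # no shared hashes: distance capped at 1
    return log((1 + j) / (2 * j)) / k
```

For example, with `x = kmer_counts("ACGTACGT", 3)` and `y = kmer_counts("ACGTTCGT", 3)`, the two samples share 2 of 7 distinct 3-mers, so the Jaccard Distance is 5/7, the Manhattan Distance is 6, and the Euclidean Distance is √6. The Mash Jaccard Distance corresponds to `1 - sketch_jaccard_index(...)` computed on the sketches rather than on the full k-mer sets.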