Figure 3.
The uniqueness of k-mers measured by fuzzy search. (A) Example of summary output. The summary output lists each k-mer sequence in sequential order with the counts of matching k-mers at each edit distance up to 2 mismatches, along with a plot of those counts. (B) Example of detailed output. Identical nucleotides are indicated by a dot (.) and mismatched nucleotides are shown at their positions. The example search result shows that the k-mer sequences at the first 5 positions are unique within 2 mismatches, while the sequence at the 6th position has 5 neighbor sequences with 2 mismatches. For instance, the 25-mer at the 6th position is identical to the 25-mer at chr6:137348026–137348050 except for two mismatches: (i) A instead of T at the 3rd base position and (ii) G instead of A at the 19th base position. (C) Example k-mers with unique exact matches identified by KmerKeys but not found by the web versions of BLAT/BLAST. (D) Example of detailed output from a KmerKeys web application query of 25-mers in gnomAD v2 in an intronic region of TP53.
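The two output styles described in panels (A) and (B) can be illustrated with a minimal Python sketch. It is not the KmerKeys implementation: the function names (`hamming`, `dot_alignment`, `mismatch_counts`) and the short toy sequences are hypothetical, and it assumes that "mismatches" means substitutions only (Hamming distance, no insertions or deletions) and that the differing base printed in the detailed view is taken from the matched sequence.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length k-mers differ."""
    if len(a) != len(b):
        raise ValueError("k-mers must have equal length")
    return sum(x != y for x, y in zip(a, b))


def dot_alignment(query: str, hit: str) -> str:
    """Detailed-style view: '.' where the bases agree, the hit's base where they differ.

    Assumption: the differing base shown comes from the hit, not the query.
    """
    if len(query) != len(hit):
        raise ValueError("k-mers must have equal length")
    return "".join("." if q == h else h for q, h in zip(query, hit))


def mismatch_counts(query: str, candidates: list[str], max_mismatches: int = 2) -> list[int]:
    """Summary-style view: counts of candidates at 0, 1, ..., max_mismatches mismatches."""
    counts = Counter(hamming(query, c) for c in candidates)
    return [counts.get(d, 0) for d in range(max_mismatches + 1)]


# Toy 10-mers invented for illustration; real queries in the figure are 25-mers.
query = "ACGTACGTAC"
candidates = [
    "ACGTACGTAC",  # exact match
    "ACTTACGTAC",  # 1 mismatch (3rd base)
    "ACTTACGTAG",  # 2 mismatches (3rd and 10th bases)
    "TTTTACGTAG",  # 4 mismatches, beyond the 2-mismatch cutoff
]

print(mismatch_counts(query, candidates))   # counts at 0, 1, and 2 mismatches
print(dot_alignment(query, candidates[2]))  # dots for identical bases
```

Under this convention, a k-mer is unique within 2 mismatches when the summary counts are 1 exact match (itself) and 0 at both 1 and 2 mismatches.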
