2015 Nov 2;16:352. doi: 10.1186/s12859-015-0806-7

Table 1.

Summary of search space per k-mer size and number of k-mers found in datasets

| k-mer size | # Canonical k-mer combinations | % of k-mers found per sample: Median | % of k-mers found per sample: MAD | % of k-mers found per sample, shared by at least two samples: Median | % of k-mers found per sample, shared by at least two samples: MAD |
|---|---|---|---|---|---|
| 11-mer | 2.10 × 10^6 | 100.00 % | 1.58 % | 100.00 % | 0.00 % |
| 15-mer | 5.40 × 10^8 | 53.59 % | 17.07 % | 100.00 % | 0.00 % |
| 17-mer | 8.60 × 10^9 | 8.90 % | 4.03 % | 98.37 % | 0.99 % |
| 21-mer | 2.20 × 10^12 | 0.05 % | 0.03 % | 81.45 % | 20.55 % |
| 31-mer | 2.30 × 10^18 | 0.000000061 % | 0.000000032 % | 67.05 % | 24.14 % |

The second column contains the total number of possible k-mers, calculated as 4^k / 2, where k is the k-mer size and the division by two is due to canonization. The third column gives the median and the Median Absolute Deviation (MAD) of the total number of k-mers found in the samples (Additional file 3: Table S3) divided by the number of possible k-mers; this is the percentage of combinations actually found and, consequently, the saturation of the search space. The fourth column gives the median and MAD of the percentage of valid k-mers (k-mers shared between at least two samples; Additional file 3: Table S3).
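
The arithmetic behind the second and third columns can be sketched in a few lines of Python. This is only an illustration of the formulas stated in the legend, not the authors' code: the function names are hypothetical, and the MAD is assumed to be the plain, unscaled median of absolute deviations from the median, which the legend does not specify.

```python
import statistics


def canonical_kmer_space(k: int) -> int:
    """Number of possible canonical k-mers, computed as 4**k / 2.

    For odd k (all sizes in the table) no k-mer equals its own reverse
    complement, so the division by two is exact.
    """
    return 4 ** k // 2


def saturation_percent(n_found: int, k: int) -> float:
    """Percentage of the canonical search space covered by n_found k-mers."""
    return 100 * n_found / canonical_kmer_space(k)


def median_and_mad(values):
    """Median and unscaled Median Absolute Deviation (an assumption here)."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    return med, mad


# Search-space sizes for the k-mer sizes listed in the table.
for k in (11, 15, 17, 21, 31):
    print(f"{k}-mer: {canonical_kmer_space(k):.2e} canonical combinations")
```

Computed this way, the search space is about 2.10 × 10^6 for 11-mers and 2.20 × 10^12 for 21-mers, in line with the table; the values for 15-, 17- and 31-mers (about 5.37 × 10^8, 8.59 × 10^9 and 2.31 × 10^18) agree with the table to two significant figures.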