2002 Apr;12(4):656–664. doi: 10.1101/gr.229202

Table 3.

Sensitivity and Specificity of Single Perfect Nucleotide K-mer Matches as a Search Criterion

A.
% identity      7      8      9     10     11     12     13     14
81%         0.974  0.915  0.833  0.726  0.607  0.486  0.373  0.314
83%         0.988  0.953  0.897  0.815  0.711  0.595  0.478  0.415
85%         0.996  0.978  0.945  0.888  0.808  0.707  0.594  0.532
87%         0.999  0.992  0.975  0.942  0.888  0.811  0.714  0.659
89%         1.000  0.998  0.991  0.976  0.946  0.897  0.824  0.782
91%         1.000  1.000  0.998  0.993  0.981  0.956  0.912  0.886
93%         1.000  1.000  1.000  0.999  0.995  0.987  0.968  0.957
95%         1.000  1.000  1.000  1.000  0.999  0.998  0.994  0.991
97%         1.000  1.000  1.000  1.000  1.000  1.000  1.000  0.999

B.
K        7        8        9       10       11       12       13       14
F  1.3e+07  2.9e+06   635783   143051    32512     7451     1719      399

(A) Columns correspond to K values of 7–14; rows correspond to percentage identities between the homologous sequences. Each entry is the fraction of homologies detected, calculated from equation 3 assuming a homologous region of 100 bases. The larger the value of K, the fewer homologies are detected.
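
The entries in panel A can be approximated with a short calculation. Equation 3 is not reproduced in this excerpt, so the sketch below assumes it has the form P = 1 - (1 - M^K)^T, where M is the fractional identity and T = floor(H/K) is the number of non-overlapping K-mers in a homologous region of H bases. This form is an assumption, chosen because it reproduces the tabulated values; the function and parameter names are illustrative only.

```python
def detection_probability(identity: float, k: int, homology_len: int = 100) -> float:
    """Chance that at least one non-overlapping K-mer in the region matches perfectly.

    Assumed form of equation 3: P = 1 - (1 - M**K)**T, with T = homology_len // k.
    """
    p_kmer = identity ** k            # probability that a single K-mer matches perfectly
    n_kmers = homology_len // k       # non-overlapping K-mers in the homologous region
    return 1.0 - (1.0 - p_kmer) ** n_kmers


if __name__ == "__main__":
    # Print rows in the same layout as panel A (K = 7..14).
    for pct in range(81, 98, 2):
        row = [detection_probability(pct / 100, k) for k in range(7, 15)]
        print(f"{pct}%", " ".join(f"{p:.3f}" for p in row))
```

Under this assumed form, the drop in sensitivity with K comes from two effects acting together: each K-mer is less likely to match perfectly (M^K shrinks), and fewer non-overlapping K-mers fit in the 100-base region (T shrinks).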

(B) K represents the size of the perfect match. F shows how many perfect matches of this size are expected to occur by chance, according to equation 4, in a genome of 3 billion bases using a query of 500 bases.
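
The F values in panel B can be checked in the same way. Equation 4 is not shown in this excerpt; the sketch below assumes the approximation F ≈ Q · (G/K) · (1/4)^K, where Q is the query length, G the genome size, and 4 the nucleotide alphabet size. This form is an assumption that reproduces the tabulated values, and the published equation may differ in detail (for example, in how K-mers at the end of the query are counted). All names are illustrative.

```python
def expected_chance_matches(
    k: int,
    query_len: int = 500,
    genome_len: int = 3_000_000_000,
    alphabet_size: int = 4,
) -> float:
    """Expected number of perfect K-mer matches arising by chance.

    Assumed approximation: F = Q * (G / K) * (1 / A)**K.
    """
    genome_kmers = genome_len / k                  # non-overlapping K-mers in the genome
    p_random_match = (1.0 / alphabet_size) ** k    # chance two random K-mers are identical
    return query_len * genome_kmers * p_random_match


if __name__ == "__main__":
    for k in range(7, 15):
        print(k, f"{expected_chance_matches(k):.4g}")
```

Each unit increase in K reduces the expected number of chance matches by slightly more than a factor of four, which is the specificity gain that offsets the loss of sensitivity shown in panel A.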