2010 Mar 30;26(9):1246–1253. doi: 10.1093/bioinformatics/btq129

Table 3.

Feature contributions for the similarity metric

Features          A      P      R      F1     ΔF1
Sim (ch)          0.963  0.879  0.895  0.887
Sim (wd)          0.937  0.844  0.747  0.793
Sim (ch + wd)     0.962  0.877  0.890  0.884
Levenshtein       0.939  0.849  0.754  0.799
Jaro–Winkler      0.918  0.920  0.534  0.676
SoftTFIDF         0.921  0.817  0.656  0.728

Full              0.965  0.883  0.900  0.892

- Sim (ch)        0.947  0.855  0.808  0.831  −0.061
- Sim (wd)        0.965  0.885  0.898  0.891  −0.001
- Sim (ch + wd)   0.950  0.868  0.810  0.838  −0.054
- Levenshtein     0.965  0.882  0.898  0.890  −0.002
- Jaro–Winkler    0.965  0.882  0.899  0.891  −0.001
- SoftTFIDF       0.965  0.882  0.901  0.892  −0.000

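A, accuracy; P, precision; R, recall; F1, F1 score; ΔF1, the difference in F1 score from the Full feature set; Sim (ch), character n-gram similarity; Sim (wd), word n-gram similarity. Rows prefixed with "-" report the Full feature set with the named feature removed.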
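
The F1 and ΔF1 columns can be checked by hand. The following is a minimal sketch, assuming F1 is the usual harmonic mean of precision and recall and that ΔF1 is the ablated F1 minus the Full F1 (0.892); the function names `f1_score` and `delta_f1` are illustrative and do not come from the article. Because the table reports P and R rounded to three decimals, a recomputed F1 may differ from the printed value by about 0.001.

```python
# Illustrative helpers (names are hypothetical, not from the article).

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


FULL_F1 = 0.892  # F1 of the Full feature set, as printed in the table


def delta_f1(ablated_f1: float, full_f1: float = FULL_F1) -> float:
    """Difference between an ablated F1 and the Full F1, to 3 decimals."""
    return round(ablated_f1 - full_f1, 3)


# F1 values of the ablation rows, copied from the table.
ablation_f1 = {
    "- Sim (ch)": 0.831,
    "- Sim (wd)": 0.891,
    "- Sim (ch + wd)": 0.838,
    "- Levenshtein": 0.890,
    "- Jaro-Winkler": 0.891,
    "- SoftTFIDF": 0.892,
}

# Recompute the ΔF1 column from the F1 column.
deltas = {name: delta_f1(f1) for name, f1 in ablation_f1.items()}

# Recompute F1 for the "Sim (ch)" row from its printed P and R.
sim_ch_f1 = f1_score(0.879, 0.895)

if __name__ == "__main__":
    for name, d in deltas.items():
        print(f"{name:<16} {d:+.3f}")
    print(f"Sim (ch) F1 recomputed: {sim_ch_f1:.3f}")
```

Read this way, removing the character n-gram similarity costs the most F1, while removing any one of the other features changes F1 by at most 0.002.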