Table 7. Error analysis: average feature similarity of pairs misclassified by Naïve Bayes.
Feature | Caenorhabditis FP | Caenorhabditis FN | Danio rerio FP | Danio rerio FN | Drosophila FP | Drosophila FN | Escherichia coli FP | Escherichia coli FN | Zea mays FP | Zea mays FN |
---|---|---|---|---|---|---|---|---|---|---|
#Instances | 1644 | 72 | 2167 | 39 | 13879 | 4844 | 161 | 9 | 390 | 66 |
Description | 0.322 | 0.320 | 0.293 | 0.372 | 0.250 | 0.515 | 0.147 | 0.172 | 0.216 | 0.428 |
Literature | 0.115 | 0.027 | 0.440 | 0.243 | 0.031 | 0.471 | 0.003 | 0.000 | 0.013 | 0.232 |
Length | 0.191 | 0.567 | 0.165 | 0.659 | 0.143 | 0.704 | 0.151 | 0.556 | 0.207 | 0.720 |
Identity | 0.936 | 0.902 | 0.954 | 0.902 | 0.974 | 0.854 | 0.983 | 0.924 | 0.962 | 0.866 |
AP | 0.015 | 0.018 | 0.008 | 0.032 | 0.027 | 0.060 | 0.037 | 0.167 | 0.054 | 0.277 |
Expect_Value | 0.012 | 0.109 | 0.019 | 0.031 | 0.168 | 0.365 | 0.037 | 0.020 | 0.055 | 0.001 |
CDS_Identity | 0.881 | 0.882 | 0.924 | 0.888 | 0.893 | 0.852 | 0.906 | 0.921 | 0.868 | 0.840 |
CDS_AP | 0.018 | 0.022 | 0.006 | 0.032 | 0.020 | 0.072 | 0.022 | 0.146 | 0.009 | 0.413 |
CDS_Expect | 0.458 | 0.348 | 0.596 | 0.299 | 1.126 | 0.36 | 0.753 | 0.589 | 0.614 | 0.056 |
TRS_Identity | 0.403 | 0.512 | 0.392 | 0.345 | 0.426 | 0.424 | 0.430 | 0.548 | 0.540 | 0.840 |
TRS_AP | 0.020 | 0.042 | 0.020 | 0.408 | 0.032 | 0.130 | 0.030 | 0.262 | 0.027 | 0.463 |
TRS_Expect | 2.456 | 1.312 | 1.630 | 0.408 | 2.061 | 1.404 | 1.799 | 0.144 | 3.227 | 0.257 |
#Instances: number of instances. FP: false positives (distinct pairs classified as duplicates). FN: false negatives (duplicates classified as distinct pairs). Feature names are explained in Table 3. Numbers are averages over the pairs that have the given feature; pairs lacking a feature are excluded.
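The averaging described in the note above can be illustrated with a minimal sketch. It assumes each pair is a dictionary with a gold label, a predicted label, and a feature map in which a missing feature is absent or `None`. The function names (`error_type`, `average_by_error_type`) and the record layout are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from statistics import mean


def error_type(is_duplicate, predicted_duplicate):
    """Return 'FP', 'FN', or None for a correctly classified pair."""
    if predicted_duplicate and not is_duplicate:
        return "FP"  # distinct pair classified as duplicate
    if is_duplicate and not predicted_duplicate:
        return "FN"  # duplicate classified as distinct pair
    return None


def average_by_error_type(pairs, feature_names):
    """Average each feature over FP and FN pairs, skipping missing values.

    Assumed (hypothetical) record layout:
    {"is_duplicate": bool, "predicted_duplicate": bool, "features": dict}
    """
    collected = {kind: {f: [] for f in feature_names} for kind in ("FP", "FN")}
    counts = {"FP": 0, "FN": 0}
    for pair in pairs:
        kind = error_type(pair["is_duplicate"], pair["predicted_duplicate"])
        if kind is None:
            continue  # correctly classified pairs are not error cases
        counts[kind] += 1
        for f in feature_names:
            value = pair["features"].get(f)
            if value is not None:  # exclude pairs that lack this feature
                collected[kind][f].append(value)
    averages = {
        kind: {f: (mean(vals) if vals else None) for f, vals in feats.items()}
        for kind, feats in collected.items()
    }
    return counts, averages


# Toy data, invented for illustration only (not values from Table 7).
toy_pairs = [
    {"is_duplicate": False, "predicted_duplicate": True,
     "features": {"Identity": 1.0, "Literature": None}},
    {"is_duplicate": False, "predicted_duplicate": True,
     "features": {"Identity": 0.5, "Literature": 0.25}},
    {"is_duplicate": True, "predicted_duplicate": False,
     "features": {"Identity": 0.75}},
]
counts, averages = average_by_error_type(toy_pairs, ["Identity", "Literature"])
print(counts)
print(averages)
```

Under this reading, the denominator of each average is the number of error pairs that actually carry the feature, so it can be smaller than the #Instances row for that column.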