Table 2.
Naive co-occurrence measures differ in how they score lineage-specific genes. For pairs of identical profiles over 30 species, both Hamming distance and Pearson correlation consistently yield scores that indicate high similarity, regardless of how many species carry the genes (Pearson correlation is undefined for constant profiles, marked with a dash). In contrast, Fisher's exact test (two-tailed) and mutual information yield less significant scores when the genes are present in fewer or more than 50% of the species.
Presences (species with gene / total) | Hamming distance | Pearson correlation | Fisher's exact test (p-value) | Mutual information |
---|---|---|---|---|
0/30 | 0 | — | 1 | 0 |
3/30 | 0 | 1 | 0.02 | 0.7 |
15/30 | 0 | 1 | 0.008 | 1 |
27/30 | 0 | 1 | 0.02 | 0.7 |
30/30 | 0 | — | 1 | 0 |
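
The qualitative pattern in the table can be explored with a minimal sketch in plain Python. It is an illustration only, not the implementation used to produce the table. The helper names (`profile`, `hamming`, `pearson`, `fisher_two_tailed`, `mutual_information`) are hypothetical, mutual information is assumed to be measured in bits, and the two-tailed Fisher p-value is assumed to sum all tables no more probable than the observed one. Because such conventions vary, the printed values need not match the table entries; only the trend (flat for Hamming and Pearson, peaked at 50% presence for Fisher and mutual information) is the point.

```python
import math

def profile(k, n=30):
    """Presence/absence profile: gene present in the first k of n species."""
    return [1] * k + [0] * (n - k)

def hamming(a, b):
    """Number of species in which the two profiles disagree."""
    return sum(x != y for x, y in zip(a, b))

def pearson(a, b):
    """Pearson correlation; None when either profile is constant (undefined)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) * (x - ma) for x in a)
    vb = sum((y - mb) * (y - mb) for y in b)
    if va == 0 or vb == 0:
        return None
    return cov / math.sqrt(va * vb)

def fisher_two_tailed(a, b):
    """Two-tailed Fisher exact p-value for the 2x2 co-occurrence table.

    Assumed convention: sum the hypergeometric probabilities of all tables
    with the same margins that are no more probable than the observed one.
    """
    n = len(a)
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    r, c = sum(a), sum(b)

    def prob(k):
        return math.comb(r, k) * math.comb(n - r, c - k) / math.comb(n, c)

    p_obs = prob(both)
    lo, hi = max(0, r + c - n), min(r, c)
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

def mutual_information(a, b):
    """Mutual information of the two profiles, in bits (assumed unit)."""
    n = len(a)
    mi = 0.0
    for x in (0, 1):
        for y in (0, 1):
            pxy = sum(1 for u, v in zip(a, b) if u == x and v == y) / n
            px = sum(1 for u in a if u == x) / n
            py = sum(1 for v in b if v == y) / n
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px * py))
    return mi

# Score each profile against an identical copy of itself.
for k in (0, 3, 15, 27, 30):
    p = profile(k)
    print(k, hamming(p, p), pearson(p, p),
          fisher_two_tailed(p, p), mutual_information(p, p))
```

Under these assumptions, Hamming distance stays at zero for every row, whereas the Fisher p-value is smallest and mutual information largest when the gene is present in half of the species.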