Table 2.
Naive co-occurrence measures differ in how they score lineage-specific genes. For pairs of identical profiles over 30 species, both Hamming distance and Pearson correlation consistently yield scores that indicate high similarity. In contrast, Fisher's exact test (two-tailed) and mutual information indicate less significance when the genes are present in markedly fewer or markedly more than 50% of the species.
| presences | Hamming | Pearson | Fisher | mutual information |
|---|---|---|---|---|
| 0/30 | 0 | — | 1 | 0 |
| 3/30 | 0 | 1 | 0.02 | 0.7 |
| 15/30 | 0 | 1 | 0.008 | 1 |
| 27/30 | 0 | 1 | 0.02 | 0.7 |
| 30/30 | 0 | — | 1 | 0 |
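
The qualitative pattern in the table can be explored with a minimal Python sketch that scores a pair of identical presence/absence profiles under the four measures. The function names (`hamming`, `pearson`, `fisher_two_tailed`, `mutual_information`, `identical_pair`) are hypothetical helpers written for this illustration, and mutual information is assumed here to be unnormalized and measured in bits. The caption does not state the conventions behind the tabulated values (logarithm base, normalization, rounding), so the sketch is meant to show the trend across presence counts and is not expected to reproduce the exact figures.

```python
from math import comb, log2, sqrt, isnan

def hamming(x, y):
    """Number of species in which the two profiles disagree."""
    return sum(a != b for a, b in zip(x, y))

def pearson(x, y):
    """Pearson correlation; NaN when either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined: zero variance
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)

def fisher_two_tailed(x, y):
    """Two-tailed Fisher's exact test on the 2x2 presence/absence table.

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    n = len(x)
    both = sum(1 for a, b in zip(x, y) if a and b)
    r1, c1 = sum(x), sum(y)

    def prob(k):
        return comb(r1, k) * comb(n - r1, c1 - k) / comb(n, c1)

    lo, hi = max(0, r1 + c1 - n), min(r1, c1)
    p_obs = prob(both)
    total = sum(prob(k) for k in range(lo, hi + 1)
                if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def mutual_information(x, y):
    """Mutual information in bits (assumed convention: log base 2, unnormalized)."""
    n = len(x)
    mi = 0.0
    for i in (0, 1):
        for j in (0, 1):
            pxy = sum(1 for a, b in zip(x, y) if a == i and b == j) / n
            px = sum(1 for a in x if a == i) / n
            py = sum(1 for b in y if b == j) / n
            if pxy > 0:
                mi += pxy * log2(pxy / (px * py))
    return mi

def identical_pair(presences, n_species=30):
    """Two identical profiles with the given number of presences."""
    profile = [1] * presences + [0] * (n_species - presences)
    return profile, list(profile)

for k in (0, 3, 15, 27, 30):
    x, y = identical_pair(k)
    print(k, hamming(x, y), pearson(x, y),
          fisher_two_tailed(x, y), mutual_information(x, y))
```

Under these assumptions, Hamming distance stays at 0 for every row and Pearson correlation is 1 wherever it is defined, while the Fisher p-value and mutual information are least informative at 0/30 and 30/30 and most extreme at 15/30.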