Table 1. Examples of three pairs of SNPs in 10 individuals (P1, …, P10) that illustrate the ability of the maximum relationship Rij and the maximum CCCij, i ∈ {A, a}, j ∈ {B, b}, to capture more meaningful and robust SNP correlations than PCC and r².
Genotype of 10 individuals (P1 to P10), followed by the correlation measures for each SNP pair:

| SNP | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | \|PCC\| | r² | max Rij | max CCCij |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SNP 1 | Aa | AA | AA | AA | AA | AA | AA | AA | AA | AA | 0.1 | 0.0 | 0.9 | 0.5 |
| SNP 2 | bb | Bb | bb | bb | bb | bb | bb | bb | bb | bb | | | | |
| | | | | | | | | | | | | | | |
| SNP 3 | Aa | Aa | AA | AA | AA | AA | AA | AA | AA | AA | 0.7 | 0.5 | 0.9 | 0.6 |
| SNP 4 | bb | Bb | bb | bb | bb | bb | bb | bb | bb | bb | | | | |
| | | | | | | | | | | | | | | |
| SNP 5 | AA | AA | AA | AA | AA | aa | Aa | AA | aa | aa | 0.3 | 0.0 | 0.5 | 0.7 |
| SNP 6 | bb | bb | bb | bb | bb | Bb | BB | Bb | BB | bb | | | | |
Note: For the first two pairs of SNPs, P3-P10 are perfectly matched with ‘AA’|‘bb’ genotypes. The only difference between the two examples is that for individual P2, SNP 1 is ‘AA’ and SNP 3 is ‘Aa’. While the PCC and r² values are highly sensitive to this small difference, CCC exhibits little sensitivity. In the third example, SNPs 5 and 6 are perfectly matched in half of the individuals (P1-P5) and uncorrelated in the other half (P6-P10). PCC and r² were overwhelmed by the lack of correlation in P6-P10 and returned low values. In contrast, CCC looked past the heterogeneity and correctly captured a high correlation value CCCAb for the ‘Ab’ relationship in P1-P5. Note that the CCC value for ‘Ab’ is higher for this example than it is for the first two pairs of SNPs, despite the fact that those pairs had more ‘AA’|‘bb’ individuals. This is because CCC adjusts for chance pairing due to varying allelic frequencies. In the second pair, the allele frequency is 0.90 for ‘A’ of SNP 3 and 0.95 for ‘b’ of SNP 4, which gives an expected frequency of 0.73 (0.90 × 0.90 × 0.95 × 0.95) for ‘AA’|‘bb’ by chance pairing alone. Since the observed frequency of ‘AA’|‘bb’ of 0.80 is only slightly greater than expected by chance, the CCCAb value is lower. However, in the third pair, where the expected frequency for ‘AA’|‘bb’ is 0.21 and the observed frequency is 0.50, CCC gives a high CCCAb value owing to the excess of ‘AA’|‘bb’ pairing over what is expected by chance. Thus, CCC is able to capture meaningful correlations more accurately than PCC or r².
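
The chance-pairing arithmetic in the note can be checked with a short sketch. It only reproduces the expected and observed ‘AA’|‘bb’ frequencies quoted above for the second and third SNP pairs; it does not compute CCC itself, whose formula is not given in this table. The function names are hypothetical and chosen for illustration, and the expected frequency assumes independent pairing of alleles within and between the two SNPs, as in the note's 0.90 × 0.90 × 0.95 × 0.95 product.

```python
def allele_freq(genotypes, allele):
    """Fraction of allele copies equal to `allele` (two copies per individual)."""
    return sum(g.count(allele) for g in genotypes) / (2 * len(genotypes))


def expected_homozygous_pair(snp_x, snp_y, allele_x, allele_y):
    """Chance frequency of the doubly homozygous genotype pair, e.g. 'AA'|'bb'.

    Assumes independent pairing, so the result is p_x^2 * p_y^2.
    """
    p = allele_freq(snp_x, allele_x)
    q = allele_freq(snp_y, allele_y)
    return (p ** 2) * (q ** 2)


def observed_homozygous_pair(snp_x, snp_y, allele_x, allele_y):
    """Fraction of individuals carrying the doubly homozygous genotype pair."""
    target = (allele_x * 2, allele_y * 2)
    hits = sum(1 for pair in zip(snp_x, snp_y) if pair == target)
    return hits / len(snp_x)


# Genotypes copied from Table 1 (individuals P1 to P10).
snp3 = ["Aa", "Aa"] + ["AA"] * 8
snp4 = ["bb", "Bb"] + ["bb"] * 8
snp5 = ["AA"] * 5 + ["aa", "Aa", "AA", "aa", "aa"]
snp6 = ["bb"] * 5 + ["Bb", "BB", "Bb", "BB", "bb"]

for label, x, y in [("SNP 3/4", snp3, snp4), ("SNP 5/6", snp5, snp6)]:
    exp = expected_homozygous_pair(x, y, "A", "b")
    obs = observed_homozygous_pair(x, y, "A", "b")
    print(f"{label}: expected {exp:.2f}, observed {obs:.2f}")
```

Run as written, this prints an expected frequency of 0.73 against an observed 0.80 for SNPs 3 and 4, and 0.21 against 0.50 for SNPs 5 and 6, matching the figures in the note.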