Table 1. Summary and genomic localization of detected variants in the CAO.
| SNPs | Short insertions | Long insertions | Short deletions | Long deletions | Total |
---|---|---|---|---|---|---|
In gene | 1362 (85.23%) | 48 (73.85%) | 28 (63.64%) | 65 (73.86%) | 11 (100%) | 1514 |
Intergenic | 202 (12.64%) | 13 (20%) | 13 (29.54%) | 19 (21.59%) | - | 247 |
In pseudogene | 34 (2.13%) | 4 (6.15%) | 3 (6.82%) | 4 (4.55%) | - | 45 |
Total | 1598 | 65 | 44 | 88 | 11 | 1806 |
Parsimony informative | 434 (27.2%) | 18 (27.7%) | 14 (31.8%) | 16 (18.2%) | 4 (36.4%) | 486 |
Compatible | 394 (90.8%) | 11 (61.1%) | 10 (71.4%) | 13 (81.2%) | 2 (50%) | 428 |
Percentages are calculated relative to the total number of variants in each variant class, except for compatible variants, where the percentage is relative to the number of parsimony-informative variants in that class.

The majority of SNPs and indels are found in coding regions. However, the distribution of variants between protein-coding and intergenic regions differs significantly from that expected by chance (Fisher’s exact test, p<0.01): intergenic regions span 7.8% of the genome but contain 13.7% of the variants (247 of 1806).
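The arithmetic behind the table can be checked with a short sketch. It recomputes the per-class percentages from the counts in Table 1 and then gauges the intergenic enrichment with a one-sided exact binomial test against the 7.8% genome fraction. The binomial test is a stand-in chosen for illustration, not the Fisher's exact test reported above, because the genome lengths needed for the 2x2 contingency table are not given here. The function names are hypothetical, and the "-" cells are assumed to mean zero.

```python
import math

# Counts copied from Table 1. Column order: SNPs, short insertions,
# long insertions, short deletions, long deletions.
# Assumption: the "-" cells in the table stand for zero variants.
counts = {
    "In gene":       [1362, 48, 28, 65, 11],
    "Intergenic":    [202, 13, 13, 19, 0],
    "In pseudogene": [34, 4, 3, 4, 0],
}


def column_percentages(table):
    """Percentage of each variant class (column) falling in each row."""
    totals = [sum(col) for col in zip(*table.values())]
    return {
        row_name: [100.0 * n / t for n, t in zip(row, totals)]
        for row_name, row in table.items()
    }


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    log_p, log_q = math.log(p), math.log1p(-p)
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * log_p + (n - i) * log_q
        for i in range(k, n + 1)
    ]
    peak = max(log_terms)
    return math.exp(peak) * sum(math.exp(t - peak) for t in log_terms)


percentages = column_percentages(counts)
n_variants = sum(sum(row) for row in counts.values())
n_intergenic = sum(counts["Intergenic"])
intergenic_share = 100.0 * n_intergenic / n_variants

# Null hypothesis for this sketch: variants land in intergenic sequence
# in proportion to its share of the genome (7.8%, as stated in the text).
p_value = binomial_upper_tail(n_intergenic, n_variants, 0.078)

print(f"Intergenic variants: {n_intergenic} of {n_variants} ({intergenic_share:.1f}%)")
print(f"One-sided binomial p-value: {p_value:.3g}")
```

Under these assumptions, the observed intergenic share is well above the 7.8% expected from genome coverage alone, in line with the enrichment described above. A full reproduction of the reported test would need the base-pair counts for coding and intergenic regions.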