2023 Dec 13;624(7992):602–610. doi: 10.1038/s41586-023-06842-7

Extended Data Fig. 9. Genomic variation impact on protein-coding genes.

(a) Bar plot showing the number of variants per individual that affect CDS exons of protein-coding genes. The horizontal dashed line indicates the average number of variants in CDS regions across the entire cohort. (b) Bar plots showing the cumulative number of whole genes contained within CNV regions, sorted by increasing size, for deletions and duplications. (c) Bar plots showing, for each CNV region, the number of individuals with a CNV identified within that region, for regions that were either intergenic or intersected whole genes. CNV regions are classified as singleton (light purple; 1 individual), polymorphic (green; > 2 and < 50% of individuals) or major (pink; ≥ 50% of individuals and < all individuals). (d) Bar plot showing the number of NCIG-only variants per LOEUF decile, grouped by their distribution across NCIG communities. Variants were classified according to the number of communities in which they were identified: private (n = 1 individual; blue), community-specific (n > 1 individual in 1 community; yellow), widespread (n > 1 individual in more than 1 community; red) or shared (n > 1 individual in all 4 communities; green). (e) Genome browser view showing sequencing alignments to ATXN3. A 'CAG' STR expansion, known to cause Machado-Joseph disease (MJD), was identified in one NCIG-P2 individual. ONT reads span the expansion (left panel; purple markers indicate insertions). Illumina short reads do not span the expansion and are soft-clipped (right panel).
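The classification rules in panels (c) and (d) can be written out as a short sketch. This is an illustration of the thresholds stated in the legend, not the authors' code: the function names, the input layout (a mapping from community label to carrier count) and the community labels are hypothetical. Two points are assumptions: "shared" is taken to override "widespread" when a variant is seen in all 4 communities, and CNV regions that fall outside the stated ranges (exactly 2 individuals, or every individual) are returned as unclassified.

```python
from typing import Optional

N_COMMUNITIES = 4  # the legend refers to "all 4 communities"


def classify_variant(carriers_per_community: dict[str, int]) -> str:
    """Panel (d): classify a variant by its distribution across communities.

    carriers_per_community maps a community label to the number of
    individuals carrying the variant (hypothetical input layout).
    """
    counts = [n for n in carriers_per_community.values() if n > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("variant has no carriers")
    if total == 1:
        return "private"             # n = 1 individual
    if len(counts) == 1:
        return "community-specific"  # n > 1 individual in 1 community
    if len(counts) == N_COMMUNITIES:
        return "shared"              # assumption: takes precedence over widespread
    return "widespread"              # n > 1 individual in more than 1 community


def classify_cnv_region(n_carriers: int, cohort_size: int) -> Optional[str]:
    """Panel (c): classify a CNV region by how many individuals carry it.

    Returns None where the legend's thresholds do not assign a class.
    """
    if not 1 <= n_carriers <= cohort_size:
        raise ValueError("n_carriers must be between 1 and cohort_size")
    if n_carriers == 1:
        return "singleton"
    fraction = n_carriers / cohort_size
    if n_carriers > 2 and fraction < 0.5:
        return "polymorphic"
    if fraction >= 0.5 and n_carriers < cohort_size:
        return "major"
    return None  # exactly 2 individuals below 50%, or all individuals
```

For example, `classify_variant({"c1": 2, "c2": 1})` returns `"widespread"`, and `classify_cnv_region(10, 100)` returns `"polymorphic"`; the labels `c1` and `c2` are placeholders rather than NCIG community names.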