2023 Dec 13;624(7992):602–610. doi: 10.1038/s41586-023-06842-7

Fig. 6. Landscape of STR expansions.

a, Top, length distribution of all non-redundant TR and STR insertions (that is, expansions) detected across the cohort (n = 141), broken down by period size. Bottom left, number of non-redundant STR expansions in each period. Bottom centre, relative frequency of each period within CDS exons. Bottom right, proportional composition of each period based on genomic context. b, Frequency of every possible triplet (top) and pentanucleotide (bottom) motif among non-redundant STR expansions. Selected motifs known to cause different repeat disorders are highlighted. For triplet disorders, pathogenic motifs with gain-of-function mechanisms are shown in red and those with loss-of-function mechanisms in blue. c, Left, normalized standard deviation (range 0–1) of allele sizes observed within each community group (matrix columns) for different STR sites (matrix rows). All expanded sites of period ≥3 bp within protein-coding genes in which allelic composition differed significantly between groups are shown (one-way ANOVA, P < 0.05). Hierarchical clustering groups STR sites on the basis of patterns of variability between groups. Right, all STR alleles (two per individual) for three example sites showing distinct patterns of variation. BLOC1S2 (top) has an intronic STR with higher allelic diversity in NCIG versus non-NCIG individuals. AC012531 (middle) has lower allelic diversity in NCIG versus non-NCIG individuals. PRRC2B (bottom) shows community-specific patterns, with NCIG-P1 and NCIG-P4 exhibiting heterogeneity.
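The normalized standard deviation in panel c can be illustrated with a minimal Python sketch. The caption does not state how the values were scaled to the 0–1 range, so this sketch assumes each group's standard deviation is divided by the largest group standard deviation at that site. The function name `normalized_sd_by_group` and the allele sizes are hypothetical and serve only as toy input; they are not data from the study.

```python
from statistics import pstdev


def normalized_sd_by_group(alleles_by_group):
    """Per-group SD of allele sizes at one STR site, rescaled to 0-1.

    Assumption: 'normalized' means dividing by the largest group SD at
    the site. The caption does not specify the actual method used.
    """
    # Population standard deviation of allele sizes within each group.
    sds = {group: pstdev(sizes) for group, sizes in alleles_by_group.items()}
    max_sd = max(sds.values())
    if max_sd == 0:
        # No variation in any group: report 0 everywhere.
        return {group: 0.0 for group in sds}
    return {group: sd / max_sd for group, sd in sds.items()}


# Hypothetical allele sizes (in repeat units) for a single STR site.
site = {
    "NCIG-P1": [10, 14, 18, 22],
    "NCIG-P4": [12, 12, 13, 13],
    "non-NCIG": [12, 12, 12, 12],
}
result = normalized_sd_by_group(site)
```

With this toy input, the group with the widest spread of allele sizes receives a value of 1 and a group with identical alleles receives 0, which corresponds to one row of the matrix described in the caption.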