The Relationship between Promoter CpG Density (o/e CpG Ratio) and Downstream Gene Loss-of-Function Intolerance
(A) The distribution of genic LOEUF (as provided by gnomAD) in each decile of promoter CpG density. The vertical line corresponds to the cutoff for highly LoF-intolerant genes (LOEUF < 0.35).
(B) Odds ratios and the corresponding 95% confidence intervals, quantifying the enrichment for highly LoF-intolerant genes (LOEUF < 0.35) among the genes in each decile of promoter CpG density. For each of the other deciles, the enrichment is computed against the 10th decile. The horizontal line corresponds to no enrichment (an odds ratio of 1).
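As an illustration of the quantity plotted in (B), the following minimal sketch computes an odds ratio and a Wald-type 95% confidence interval from a 2x2 table of gene counts. The function name, the argument names, and the choice of the Wald interval are assumptions made for this example; the caption does not state which interval method was used.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: highly LoF-intolerant genes in the decile of interest
    b: remaining genes in the decile of interest
    c: highly LoF-intolerant genes in the reference decile
    d: remaining genes in the reference decile
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio) under the Wald approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only to exercise the function.
or_, lo, hi = odds_ratio_ci(10, 90, 50, 50)
```

An odds ratio of 1 (a log odds ratio of 0) means the decile and the reference decile contain highly LoF-intolerant genes in the same proportion.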
In both (A) and (B), CpG density deciles are labeled from 1 to 10, with 1 being the most CpG-poor decile and 10 the most CpG-rich.
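The sketch below shows one way an observed-to-expected (o/e) CpG ratio and decile labels of this kind could be computed. It assumes a common definition of the ratio, namely the CpG count multiplied by the sequence length and divided by the product of the C and G counts; the caption does not specify the exact definition used, and the function names are hypothetical.

```python
def cpg_oe_ratio(seq):
    """Observed-to-expected CpG ratio of a DNA sequence (assumed definition)."""
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    n_cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if n_c == 0 or n_g == 0:
        return 0.0
    return n_cpg * len(seq) / (n_c * n_g)

def decile_labels(values):
    """Label each value 1 (lowest tenth) to 10 (highest tenth) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        # Ties are split by input order, an arbitrary choice for this sketch.
        labels[idx] = rank * 10 // len(values) + 1
    return labels

# Toy sequences, not real promoters.
ratios = [cpg_oe_ratio(s) for s in ("ACGT", "CGCG", "AATT")]
```

With this labeling, decile 1 holds the most CpG-poor promoters and decile 10 the most CpG-rich, matching the convention in the caption.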
(C) The percentage of LOEUF variance (adjusted r²) explained by CpG density, computed in 1 kb windows. Each point corresponds to a window. We start with a window centered at 2 kb upstream of the TSS, and slide it in 250 bp steps in the 5′-to-3′ direction, until the final window is centered at 2 kb downstream. Red and pink points correspond to intervals entirely upstream or downstream, respectively, of the TSS, with squares indicating intervals extending beyond 2 kb. Orange points correspond to intervals containing both upstream and downstream sequence.
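The sliding-window scheme in (C) can be sketched as follows. The code enumerates 1 kb windows centered from 2 kb upstream to 2 kb downstream of the TSS in 250 bp steps, classifies each window as in the caption, and provides an adjusted r² for a single predictor. Coordinates are relative to the TSS at position 0 with negative values upstream; the boundary conventions, the function names, and the use of a simple linear regression are assumptions made for illustration.

```python
def tss_windows(span=2000, step=250, width=1000):
    """Return (center, start, end, position, beyond_span) for each window."""
    half = width // 2
    windows = []
    for center in range(-span, span + 1, step):
        start, end = center - half, center + half
        if end <= 0:
            position = "upstream"
        elif start >= 0:
            position = "downstream"
        else:
            position = "spanning"  # contains sequence on both sides of the TSS
        beyond = start < -span or end > span
        windows.append((center, start, end, position, beyond))
    return windows

def adjusted_r2(x, y, n_predictors=1):
    """Adjusted r^2 of a least-squares line fitted to y as a function of x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r2 = sxy ** 2 / (sxx * syy)
    # Penalize r^2 for the number of predictors relative to sample size.
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

windows = tss_windows()
```

Under these conventions the scheme yields 17 windows, of which the two outermost on each side extend beyond 2 kb from the TSS. In practice, `adjusted_r2` would be called once per window with that window's CpG densities and the corresponding LOEUF values.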