Figure 3.
Regulatory variants in repetitive sequences have higher MPRA activity
(A) Number of regulatory variants within repeats. Asterisks indicate significant enrichments (BH-adjusted p value determined by a Pearson’s chi-square test). OR, odds ratio.
(B) MPRA activities of variants in repetitive sequences (rep. seq.) or unique sequences for all variants (left), active variants (middle), and regulatory variants (right) in CMs (top) and VSMCs (bottom). Numbers of observations per group are written below each boxplot. Asterisks indicate significant differences between distributions determined by two-sided Mann-Whitney tests at p < 0.05. P values are depicted above plots. Box limits represent upper and lower quartiles. Central boxplot line represents the median and whiskers represent 1.5× IQR. Points represent outliers. These comparisons are considered to be independent. Therefore, multiple-testing correction was not applied.
(C) Total numbers of predicted TFBSs located in repetitive regions vs. non-repetitive regions (unique seq.): other variants for all MPRA-tested variants (left), regulatory sequences in CMs (middle), and regulatory sequences in VSMCs (right). Asterisks denote significance (∗∗p < 0.05, two-sided Mann-Whitney tests). Box limits represent upper and lower quartiles. Central boxplot line represents the median and whiskers represent 1.5× IQR. Points represent outliers. These comparisons are considered to be independent. Therefore, multiple-testing correction was not applied.
(D) Colored dots denote significant enrichment of predicted TFBSs across variants in repetitive sequences (hypergeometric testing, FDR < 0.05).
(E) Examples of two zinc-finger TFs harboring KRAB domains that act as repressors (left and middle, CMs; right, VSMCs). Numbers in plots indicate p values and the total number of regulatory variants with a disrupted TFBS for the respective TF. Box limits represent upper and lower quartiles. Central boxplot line represents the median and whiskers represent 1.5× IQR. See also Figure S4; Tables S5 and S6.
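
For readers who want to see the kind of enrichment testing referred to in (D), the following is a minimal sketch of a one-sided hypergeometric test followed by false-discovery-rate adjustment. It is not the authors' code. The function names `hypergeom_sf` and `bh_adjust` are hypothetical, and the use of the Benjamini-Hochberg procedure for the FDR in (D) is an assumption: the legend names BH only for (A).

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """Upper-tail hypergeometric p value, P(X >= k).

    M: total number of variants (population size)
    n: variants carrying a predicted TFBS for a given TF
    N: variants located in repetitive sequences (sample size)
    k: variants in repetitive sequences that carry the TFBS
    """
    upper = min(n, N)
    total = comb(M, N)
    # comb() returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Under these assumptions, a TF would be marked as enriched when its adjusted p value from `bh_adjust` falls below 0.05, with one `hypergeom_sf` call per TF supplying the raw p values.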
