. 2017 Oct 26;6:e28383. doi: 10.7554/eLife.28383

Figure 2. Human PRDM9 can bind promoters, though recombination is suppressed.

(a) The chance-corrected proportion of protein-coding genes that have a PRDM9 peak center occurring within 500 bp of the TSS, stratified by different PRDM9 enrichment value thresholds (shades of green, with thresholds listed), in each quartile of force-called H3K4me3 enrichment surrounding the TSS in untransfected cells. The power to detect weaker binding events increases at more active promoters (as measured by H3K4me3), though strong PRDM9 binding events appear at roughly 10% of all promoters regardless of activity. (b) Barplot illustrating the proportion of promoter or non-promoter PRDM9 peaks assigned to each of the 7 motifs (or no motif, in gray). Motif 7 appears 2-fold enriched in promoter peaks. (c) Mean HapMap CEU recombination rates are reported for promoter (pink squares) and non-promoter (gray circles) human PRDM9 peaks split into quartiles of PRDM9 enrichment (filtered not to overlap repeats or occur within 15 Mb of a telomere; error bars represent two standard errors of the mean). Both median enrichment values and recombination rates are greater for non-promoter peaks, even in overlapping ranges of PRDM9 enrichment. (d) Mean recombination rate in 20 kb windows centered on bound motifs, for promoter (pink) and non-promoter (gray) peaks further filtered only to include peaks with PRDM9 enrichment values between 1 and 2 (smoothing: ksmooth bandwidth 200). (e) Mean H3K36me3 enrichment in transfected cells divided by mean H3K36me3 enrichment in untransfected cells at 36,000 non-promoter PRDM9 binding sites split into quartiles of PRDM9 enrichment (shades of purple). (f) same as e but for 10,000 promoter PRDM9 binding sites split into quartiles of PRDM9 enrichment. (g) The absolute mean enrichment values used to generate plots e and f, split into transfected (solid) and untransfected (dotted) samples at promoter (pink) and non-promoter (gray) PRDM9 binding sites in the top quartile of PRDM9 enrichment. 
There is a depletion of H3K36me3 coverage surrounding promoters in untransfected cells, but the magnitude of this depletion decreases in transfected cells. (h) At 4,000 protein-coding genes with a strong PRDM9 binding peak within 500 bp of the TSS (PRDM9 enrichment >2 and <10), we show the relationship between force-called H3K4me3 enrichment and force-called H3K36me3 enrichment in the 1 kb surrounding each TSS, for both transfected and untransfected cells (solid and dotted lines). Error bars indicate two standard errors of the mean H3K36me3 enrichment within each quintile of H3K4me3 enrichment. H3K36me3 enrichment increases in transfected cells at all strongly bound promoters, but this effect diminishes almost to 0 as promoter activity increases (which forces H3K36me3 close to 0 in all cells). This effect cannot be accounted for by the modest decrease in PRDM9 enrichment at more active promoters (mean PRDM9 enrichment decreases from 4.3 in the first H3K4me3 quintile to 3.1 in the fifth quintile).
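
The quartile summaries in panels c and h (a mean per bin of enrichment, with error bars of two standard errors of the mean) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `quartile_means`, the equal-count binning, and the use of the sample standard deviation are assumptions.

```python
import numpy as np

def quartile_means(enrichment, rate, n_bins=4):
    """Split sites into equal-count bins of enrichment (assumed binning scheme).

    Returns one tuple per bin: (median enrichment, mean rate, 2 * SEM of rate).
    """
    enrichment = np.asarray(enrichment, dtype=float)
    rate = np.asarray(rate, dtype=float)
    # Sort sites by enrichment, then cut the sorted order into n_bins groups.
    order = np.argsort(enrichment, kind="stable")
    summary = []
    for idx in np.array_split(order, n_bins):
        values = rate[idx]
        # Standard error of the mean, using the sample standard deviation.
        sem = values.std(ddof=1) / np.sqrt(len(values))
        summary.append(
            (float(np.median(enrichment[idx])), float(values.mean()), float(2 * sem))
        )
    return summary
```

Calling `quartile_means` separately on promoter and non-promoter peaks would give the two series compared in panel c; each bin needs at least two sites for the standard error to be defined.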

Figure 2—figure supplement 1. Chimp w11a PRDM9 binds a T-rich motif away from human binding sites.

(a) Comparison of the number of human (B allele; blue) and chimp (w11a allele; semitransparent green) PRDM9 ChIP-seq peaks in 1 Mb bins across human Chr1. Consistent with their different predicted binding motifs, we observe very different binding landscapes across the genome. (b) The proportion of chimp PRDM9 peaks with a human PRDM9 peak center occurring within 500 bp, for each decile of chimp PRDM9 enrichment. Overall, chimp and human peaks overlap less than expected by chance (gray line), especially for stronger chimp peaks. (c) Top: the only non-degenerate motif returned by our motif-finding algorithm when trained on the top 5,000 chimp PRDM9 peaks ranked by enrichment, representing the first empirically determined direct binding motif for any chimp PRDM9 allele. Although the motif extends 53 bp (the expected span of the zinc fingers), only a 17 bp core region shows high specificity, and this region overlaps and matches in silico binding predictions (Persikov et al., 2009; Persikov and Singh, 2014; Schwartz et al., 2014; Auton et al., 2012), in particular a submotif (highlighted in yellow) shown to be common to many western chimp PRDM9 alleles (reproduced from Schwartz et al., 2014 Supplementary Material, Supplemental Figure 2, available under the terms of the Creative Commons Attribution License, https://creativecommons.org/licenses/by/4.0/). Zinc-finger residues at DNA-contacting positions (labeled −1, 3, 6) are illustrated below each zinc-finger position, classified by polarity, charge, and presence of aromatic side chains. In contrast to the human B allele, this chimp allele has 18 instead of 12 canonical zinc fingers, and they differ in amino acid types at the DNA-contacting positions.
Figure 2—figure supplement 2. Human PRDM9 can bind promoters, though DSBs do not occur.

(a) FIMO was used to identify the top 1 million matches for Motif 1 in hg19 (Bailey et al., 2015). For 0.1 percentile bins of increasing FIMO score, the proportion of motif matches occurring within 150 bp of a PRDM9 peak center is plotted (p < 10⁻⁶, minsep 250). Even the strongest 0.1% of motif matches are bound only 50% of the time. (b) PRDM9 peaks overlapping Motif 1 (and having more than 5 input reads overlapping the peak center) were divided into those overlapping promoters (stringently, those within 1 kb of a TSS, overlapping an H3K4me3 peak in untransfected cells, and overlapping a DNase HS site; gray), and non-promoters (failing those criteria and further not overlapping an H3K4me3 peak reported by any ENCODE data; see Materials and methods; pink). Mean raw input coverage values are plotted in decile bins of FIMO score, with error bars representing ±2 s.e.m. (c,d) Same as b, but with mean sum of raw ChIP fragment coverage values in each bin (c) or mean computed enrichment values in each bin (d). Overall, promoters show greater input sequencing coverage and thus we have greater power to detect weak binding in these regions. When corrected for this sequencing bias, we see that promoter binding sites tend to have weaker binding enrichment for a given FIMO score. (e) Mean force-called testis DMC1 enrichment values (Pratto et al., 2014) are reported for promoter (pink squares) and non-promoter (gray circles) human PRDM9 peaks split into quartiles of PRDM9 enrichment (filtered to not overlap repeats or occur within 15 Mb of a telomere; error bars represent two standard errors of the mean). Both median PRDM9 enrichment values and DMC1 enrichment values are greater for non-promoter peaks, even in overlapping ranges of PRDM9 enrichment, as observed for LD-based recombination rate estimates (Figure 2c).
(f) Mean raw testis DMC1 coverage in 20 kb windows centered on bound motifs, for promoter (pink) and non-promoter (gray) peaks further filtered only to include peaks with PRDM9 enrichment values between 1 and 2 (smoothing: ksmooth bandwidth 200). There is a local spike in DMC1 coverage around non-promoter binding sites, similar to the peak observed in LD-based recombination rate (Figure 2d), and an apparent depletion of DMC1 coverage at promoter binding sites.
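
The calculation behind panel a (the fraction of motif matches lying within 150 bp of a PRDM9 peak center, per bin of increasing FIMO score) might look like the sketch below. It is illustrative only: the function name `bound_fraction_by_score_bin`, the single-chromosome coordinates, the equal-count score bins, and the requirement of at least one peak center are assumptions, and it does not call FIMO itself.

```python
import numpy as np

def bound_fraction_by_score_bin(motif_pos, motif_score, peak_centers,
                                max_dist=150, n_bins=10):
    """Fraction of motif matches within max_dist bp of any peak center,
    reported for equal-count bins of increasing motif score.

    Assumes all coordinates are on one chromosome and peak_centers is non-empty.
    """
    motif_pos = np.asarray(motif_pos)
    motif_score = np.asarray(motif_score, dtype=float)
    centers = np.sort(np.asarray(peak_centers))
    # Locate the nearest peak center on each side of every motif match.
    j = np.searchsorted(centers, motif_pos)
    left = centers[np.clip(j - 1, 0, len(centers) - 1)]
    right = centers[np.clip(j, 0, len(centers) - 1)]
    dist = np.minimum(np.abs(motif_pos - left), np.abs(motif_pos - right))
    bound = dist <= max_dist
    # Rank matches by score and average the bound indicator within each bin.
    order = np.argsort(motif_score, kind="stable")
    return [float(bound[idx].mean()) for idx in np.array_split(order, n_bins)]
```

Setting `n_bins=1000` on one million matches would correspond to the 0.1 percentile bins described in the caption.
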
Figure 2—figure supplement 3. ATAC-seq profiles showing nucleosome phasing around PRDM9 binding sites.

(a) ATAC-seq profile plots surrounding a set of the ~15,000 strongest human PRDM9 ChIP-seq peaks (filtered to require a motif match and to not overlap an annotated DNase hypersensitive site) in cells transfected with human PRDM9. ‘Coverage’ here refers to the frequency with which an ATAC-seq fragment center occurs at each position, such that ‘Nuc.-free’ coverage tracks the centers of nucleosome-depleted regions, and ‘MonoNuc.’ coverage tracks the centers of single nucleosomes. Coverage values are normalized to the mean values observed between 1,500 and 3,000 bases away from each site, as a measure of background, and smoothed (ksmooth bandwidth = 50). The human-transfected cells show strongly phased nucleosomes centered at ~100 bp to either side of the motif and an elevated signature of nucleosome depletion at the center. (b-d) The same plot at the same sites as a but for cells transfected with the zinc-finger domain alone (b), untransfected cells (c), or cells transfected with a truncated construct excluding the ZF domain (d). These plots show that the ZF domain is insufficient to phase nucleosomes and confirm that PRDM9 binding favors nucleosome-depleted regions. They also confirm that PRDM9 creates a nucleosome-depleted region near its binding site (Baker et al., 2014).
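
The background normalization and smoothing described in panel a can be sketched as below. This is a hedged illustration rather than the authors' pipeline: the function name `normalize_and_smooth` is hypothetical, and a box (moving-average) kernel is assumed because the caption gives only the bandwidth, not the kernel type.

```python
import numpy as np

def normalize_and_smooth(offsets, coverage, bg_min=1500, bg_max=3000, bandwidth=50):
    """Scale an aggregate coverage profile by its flanking background,
    then smooth it with a box kernel of the given total width.

    offsets  : position of each value relative to the binding site (bp)
    coverage : aggregate fragment-center coverage at each offset
    """
    offsets = np.asarray(offsets, dtype=float)
    coverage = np.asarray(coverage, dtype=float)
    # Background: mean coverage 1,500 to 3,000 bases away on either side.
    flank = (np.abs(offsets) >= bg_min) & (np.abs(offsets) <= bg_max)
    normalized = coverage / coverage[flank].mean()
    # Box kernel (assumed): average all points within bandwidth / 2 of each offset.
    half = bandwidth / 2.0
    return np.array(
        [normalized[np.abs(offsets - x) <= half].mean() for x in offsets]
    )
```

With this scaling, a value of 1 means coverage equal to the flanking background, so peaks and dips in the profile read directly as fold changes relative to background.
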
Figure 2—figure supplement 4. PRDM9's ZF domain is necessary and sufficient for nuclear localization.

Representative results of immunofluorescence detection of V5 tags in HEK293T cells transfected with full-length human PRDM9 (first column), the ZF domain alone (‘ZF-only’, second column), or everything but the ZF domain (‘no-ZF’, third column). Grayscale images show fluorescence intensities of DAPI to mark the nucleus (first row), or of the anti-V5 antibody to mark transfected protein localization (second row). False-colored merged images (DAPI blue, anti-V5 red) show that the full-length and ZF-only proteins are restricted to the nucleus, while the no-ZF construct localizes throughout the cell. Note that these images are not representative of the overall transfection efficiencies of these constructs.