. 2020 Jul 20;9:e56523. doi: 10.7554/eLife.56523

Figure 4. DAZL binds a UGUU(U/A) motif within 3' UTRs.

(A) De novo motif discovery from replicated DAZL peaks in 3' UTR exons. Motif analyses were carried out with HOMER and MEME tools using crosslinked peaks ±10 nucleotides, with all expressed 3' UTRs (TPM ≥1) as background. The top three ranked motifs identified via HOMER are shown. One statistically significant motif was identified via MEME. (B) GUU-centered motif analysis of replicated peaks in 3' UTRs via kpLogo. For each crosslinked peak ±10 nucleotides, the closest GUU was identified, and all sequences were aligned along the GUU. Background was a subset of unbound GUUs randomly selected from the full-length 3' UTRs that contain DAZL peaks. As P values are extremely small (<1×10⁻³⁰⁸), residues are scaled by test statistics. (C) Position of all UGUU(U/A), GUU, and UUU motifs relative to crosslinked nucleotides from replicated peaks in 3' UTRs. 0 represents the crosslinked nucleotide. Enrichment was identified relative to randomly selected sequences from the full-length 3' UTRs that contain DAZL peaks. (D) Conservation of DAZL binding sites across vertebrates based on phyloP and phastCons scores. DAZL-bound nucleotides identified via iCLIP were compared with unbound nucleotides from the same 3' UTRs (two-sided Mann-Whitney U test). (E) DAZL’s 3' UTR binding site in Celf1 is conserved among vertebrates. Blue shading highlights nucleotides that reflect the consensus. Bold designates DAZL’s UGUU(U/A) motif. Asterisk marks crosslinked nucleotide in DAZL iCLIP data. Sequence shown is absent from coelacanth. (F) Frequency of DAZL binding sites per DAZL-bound transcript. The majority of DAZL targets have more than one DAZL binding site (those targets to the right of the vertical dashed line). (G) Distance between adjacent DAZL binding sites in 1,649 DAZL-bound transcripts with more than one DAZL binding site. (H) Relative position of DAZL binding sites along the 3' UTR. The start and end of the 3' UTR were designated as 0 and 1, respectively. DAZL binding sites are enriched at the end and, to a lesser extent, at the start, relative to randomly selected sites in the same 3' UTRs (dashed line) (two-sided Kolmogorov-Smirnov test). (I) Absolute position of DAZL binding sites along the 3' UTR. DAZL binding sites exhibit a sharp accumulation 20–100 nucleotides from the end of the 3' UTR and a broader accumulation 100–240 nucleotides from the start of the 3' UTR relative to randomly selected positions within the same 3' UTRs (dashed line) (two-sided Kolmogorov-Smirnov tests). ****, p<0.0001.
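
The alignment step in panel B (finding the GUU closest to each crosslinked peak within ±10 nucleotides) and the 0-to-1 scaling in panel H can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the choice to measure distance from the motif's first nucleotide, the tie-breaking rule, and the 0-based coordinates are all assumptions made for illustration.

```python
def closest_motif_offset(seq, crosslink, motif="GUU", flank=10):
    """Offset (motif start minus crosslink index) of the nearest motif.

    Assumption: distance is measured from the first nucleotide of the motif,
    and only motif starts within +/- flank of the crosslink are considered.
    Ties are resolved in favor of the upstream occurrence.
    Returns None when no occurrence lies in the window.
    """
    best = None
    lo = max(0, crosslink - flank)
    hi = min(len(seq) - len(motif), crosslink + flank)
    for start in range(lo, hi + 1):
        if seq[start:start + len(motif)] == motif:
            offset = start - crosslink
            if best is None or abs(offset) < abs(best):
                best = offset
    return best


def relative_position(site, utr_length):
    """Scale a 0-based position so the 3' UTR start maps to 0 and the end to 1."""
    if utr_length < 2:
        raise ValueError("3' UTR must be at least 2 nucleotides long")
    return site / (utr_length - 1)
```

Sequences would then be aligned by shifting each window by its returned offset before building the logo; that step is left to the logo tool.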

Figure 4—source data 1. Characterization of DAZL binding within 3' UTR.

Figure 4—figure supplement 1. DAZL iCLIP motif enrichment and conservation.

(A) Position of UGUU(U/A) and other motifs relative to crosslinked nucleotides from replicated peaks in 3' UTRs. 0 represents the crosslinked nucleotide. Enrichment was identified relative to randomly selected sequences from the full-length 3' UTRs bound by DAZL. Left panel: UGUU and GUU(U/A) represent truncations of the UGUU(U/A) motif. UGUU was also previously identified as a DAZL motif in one study (Li et al., 2019). Right panel: previously reported DAZL motifs GUUG (Zagore et al., 2018), GUUC (Maegawa et al., 2002; Reynolds et al., 2005), and UUU(C/G)UUU (Chen et al., 2011). (B) Percentage of DAZL binding sites with GUU, UGUU(U/A) and other motifs. Dashed line for each motif represents expected percentage calculated using the nucleotide frequency within 3' UTRs of DAZL-bound transcripts. (C) Enrichment of specific motifs at replicated 3' UTR peaks from an independent DAZL iCLIP dataset from P6 testes (Zagore et al., 2018). Dataset was reanalyzed using our computational pipeline. AME from the MEME Suite (McLeay and Bailey, 2010) was used to identify the enrichment of each motif at crosslinked nucleotides in replicated peaks relative to shuffled control sequences. P value was adjusted with Bonferroni correction. (D) GUU and UGUU(U/A) motif enrichment at replicated peaks from expressed transcripts (TPM ≥1) from each type of genomic region. Analysis was carried out as described in C. P value was adjusted with Bonferroni correction to account for multiple testing. (E) DAZL’s 3' UTR binding sites in Lin28a are conserved among vertebrates. Blue highlights nucleotides conserved across all sequences. Bold designates DAZL’s UGUU(U/A) motif. Asterisks mark crosslinked nucleotides in DAZL iCLIP data. Lin28a sequence is absent from opossum, frog, and zebrafish. (F) DAZL’s 3' UTR binding sites in Ep300 are conserved among vertebrates. Formatting as described in E. Ep300 sequence is absent from coelacanth.
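
The expected frequencies in panel B are described only as being derived from nucleotide frequencies. One simple way to do this, sketched below in Python, is to multiply per-nucleotide frequencies under an independence assumption. The function names are hypothetical, and the legend does not say how a per-position probability was converted to a percentage of binding sites, so that step is deliberately left out.

```python
from collections import Counter
from itertools import product


def nucleotide_frequencies(sequences):
    """Fraction of each nucleotide across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    total = sum(counts.values())
    return {nt: n / total for nt, n in counts.items()}


def expand_motif(pattern):
    """Expand alternatives, e.g. 'UGUU(U/A)' -> ['UGUUU', 'UGUUA']."""
    choices = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "(":
            j = pattern.index(")", i)
            choices.append(pattern[i + 1:j].split("/"))
            i = j + 1
        else:
            choices.append([pattern[i]])
            i += 1
    return ["".join(p) for p in product(*choices)]


def motif_probability(pattern, freqs):
    """Probability that the motif starts at a given position.

    Assumption: nucleotides are independent draws from `freqs`.
    """
    total = 0.0
    for variant in expand_motif(pattern):
        p = 1.0
        for nt in variant:
            p *= freqs.get(nt, 0.0)
        total += p
    return total
```

Under this model a U-rich background raises the expected frequency of UGUU(U/A) considerably, which is why the background is taken from the bound 3' UTRs themselves and not from a uniform composition.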

Figure 4—figure supplement 2. DAZL binding along the 3' UTR.

(A) Gene Set Enrichment Analysis (GSEA) enrichment scores showing the degree to which gene sets are overrepresented among DAZL targets with more binding sites, relative to all targets. The gene set for ‘undifferentiated spermatogonia’ is listed in Figure 2—source data 1. The gene sets ‘mRNA splicing, via spliceosome’ and ‘transcription by RNA polymerase II’ are from Gene Ontology (GO) terms. Adjusted P values shown. (B) Relationship between number of DAZL binding sites and transcript abundance. (C) Relationship between number of DAZL binding sites and 3' UTR length. (D) Relationship between number of DAZL binding sites and number of UGUU(U/A) motifs in the 3' UTR. (E) Relative positions of DAZL binding site density along the 3' UTR. The density of DAZL binding sites was statistically distinct from the density of all UGUU(U/A) motifs within DAZL-bound 3' UTRs (dashed line) (two-sided Kolmogorov-Smirnov test). (F) Absolute position of DAZL binding site density along the 3' UTR. The densities of DAZL binding sites were statistically distinct from the densities of all UGUU(U/A) motifs within DAZL-bound 3' UTRs (dashed lines) (two-sided Kolmogorov-Smirnov tests). ***, p<0.001, ****, p<0.0001.
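
The two-sample Kolmogorov-Smirnov comparisons in panels E and F rest on the largest gap between two empirical cumulative distributions. The Python sketch below computes only that statistic from two lists of positions. It is an illustration, not the authors' analysis: the function name is hypothetical, and the p-value calculation is omitted (in practice a statistics library such as SciPy's `scipy.stats.ks_2samp` would typically be used).

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The ECDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

Here `sample_a` could hold the relative positions of binding sites and `sample_b` the relative positions of all UGUU(U/A) motifs in the same 3' UTRs; a statistic near 0 means the two distributions track each other closely.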