(A) Position of UGUU(U/A) and other motifs relative to crosslinked nucleotides from replicated peaks in 3' UTRs. 0 represents the crosslinked nucleotide. Enrichment was identified relative to randomly selected sequences from the full-length 3' UTRs bound by DAZL. Left panel: UGUU and GUU(U/A) represent truncations of the UGUU(U/A) motif. UGUU was also previously identified as a DAZL motif in one study (Li et al., 2019). Right panel: previously reported DAZL motifs GUUG (Zagore et al., 2018), GUUC (Maegawa et al., 2002; Reynolds et al., 2005), and UUU(C/G)UUU (Chen et al., 2011). (B) Percentage of DAZL binding sites with GUU, UGUU(U/A), and other motifs. The dashed line for each motif represents the expected percentage, calculated using the nucleotide frequency within 3' UTRs of DAZL-bound transcripts. (C) Enrichment of specific motifs at replicated 3' UTR peaks from an independent DAZL iCLIP dataset from P6 testes (Zagore et al., 2018). The dataset was reanalyzed using our computational pipeline. AME from the MEME Suite (McLeay and Bailey, 2010) was used to identify the enrichment of each motif at crosslinked nucleotides in replicated peaks relative to shuffled control sequences. P value was adjusted with Bonferroni correction to account for multiple testing. (D) GUU and UGUU(U/A) motif enrichment at replicated peaks from expressed transcripts (TPM ≥1) from each type of genomic region. Analysis was carried out as described in C. P value was adjusted with Bonferroni correction to account for multiple testing. (E) DAZL’s 3' UTR binding sites in Lin28a are conserved among vertebrates. Blue highlights nucleotides conserved across all sequences. Bold designates DAZL’s UGUU(U/A) motif. Asterisks mark crosslinked nucleotides in DAZL iCLIP data. Lin28a sequence is absent from opossum, frog, and zebrafish. (F) DAZL’s 3' UTR binding sites in Ep300 are conserved among vertebrates. Formatting as described in E. Ep300 sequence is absent from coelacanth.
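
As an illustration of the expected percentages shown as dashed lines in B, the following minimal Python sketch estimates how often a motif would occur by chance given background nucleotide frequencies. It is not the pipeline used for the figure. The function names, the uniform frequencies, the window length, and the assumption that positions are independent are all hypothetical choices made for this example.

```python
from math import prod

def motif_probability(motif_positions, nt_freq):
    """Chance that a random k-mer matches the motif.

    motif_positions lists the nucleotides allowed at each position,
    e.g. ["U", "G", "U", "U", "UA"] for UGUU(U/A).
    Assumes each position is drawn independently from nt_freq.
    """
    return prod(sum(nt_freq[n] for n in allowed) for allowed in motif_positions)

def expected_percentage(motif_positions, nt_freq, window):
    """Approximate percentage of windows containing at least one match.

    Treats overlapping start positions as independent, which is only
    an approximation; the window length is a hypothetical parameter.
    """
    n_starts = window - len(motif_positions) + 1
    if n_starts < 1:
        return 0.0
    p = motif_probability(motif_positions, nt_freq)
    return 100.0 * (1.0 - (1.0 - p) ** n_starts)

# Hypothetical uniform background; real values would be counted
# from the 3' UTRs of DAZL-bound transcripts.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}
GUU = ["G", "U", "U"]
UGUU_UA = ["U", "G", "U", "U", "UA"]

if __name__ == "__main__":
    for name, motif in [("GUU", GUU), ("UGUU(U/A)", UGUU_UA)]:
        print(name, expected_percentage(motif, uniform, window=9))
```

With measured nucleotide frequencies in place of `uniform`, the same calculation gives a background expectation against which an observed percentage of binding sites can be compared.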