Figure 3: Enhancers and promoters display differential motif complexity.
(A) PCA embeddings of internal model representations of sequences at PRO-cap peaks overlapping promoters or distal enhancers. (B) Distributions of the number of identified motif instances in peaks overlapping either promoters or distal enhancers. (C) Fraction of PRO-cap peaks containing at least one instance of a motif, overall vs. in peaks overlapping either promoters or distal enhancers (** = p < 2e-4, Mann-Whitney U test). (D) Identified motif instance strengths (cosine similarity to the motif CWM) across PRO-cap peaks overlapping promoters and distal enhancers (* = p < 1e-2, ** = p < 1e-4, Mann-Whitney U test). (E) Counts task predictions made on held-out PRO-cap peaks by ProCapNet (y-axis) vs. a re-trained version of ProCapNet that only saw promoter sequences during training (x-axis).
