Author manuscript; available in PMC: 2021 Aug 1.
Published in final edited form as: Nat Mach Intell. 2021 Jan 18;3(2):172–180. doi: 10.1038/s42256-020-00282-y

Extended Data Fig. 4. Context sequence features specific to proximal vs. distal sites.

(a) Enrichment of 5-mers in high-scoring Grad-CAM regions for proximal (left) and distal (right) binding sites. Proximal and distal TF binding sites are defined as described in Online Methods. Rows and columns are ordered the same as in Fig. 3. Panels (b) and (c) are the same as (a) but show data for GC-controlled (b) and DNaseI-controlled (c) models. For (a-c), colors denote odds ratios and the sizes of the boxes denote statistical significance, as in Fig. 3. (d) Comparison of top-scoring 5-mers in proximal vs. distal SP1 sites. Bars show the odds ratio of enrichment of each sequence in top 5-mers for all (gray), proximal (red) and distal (blue) SP1 sites. The top 20 5-mers, ranked by the best odds ratio across all three SP1 models (all, proximal, and distal sites), are shown. Error bars show 95% confidence intervals on odds ratios. Panels (e) and (f) are the same as (d) but show data for GC-controlled (e) and DNaseI-controlled (f) models.
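As an illustration of the quantities plotted in (d-f), the following is a minimal sketch of how an odds ratio with a 95% confidence interval could be computed for one 5-mer from a 2x2 contingency table, and how 5-mers could then be ranked by their best odds ratio across models. The caption does not state which interval method was used; the Wald (log odds ratio) interval here is an assumption, as are the function names, the table layout, and the reading of "best" as the maximum across models.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    Hypothetical layout (an assumption, not taken from the paper):
      a = high-scoring regions containing the 5-mer
      b = high-scoring regions lacking the 5-mer
      c = background regions containing the 5-mer
      d = background regions lacking the 5-mer
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


def rank_by_best_odds_ratio(odds_by_model, top_n=20):
    """Rank 5-mers by their maximum odds ratio across models.

    odds_by_model maps a model name (e.g. "all", "proximal", "distal")
    to a dict of {kmer: odds_ratio}. Taking the maximum as the "best"
    odds ratio is an assumption.
    """
    best = {}
    for ratios in odds_by_model.values():
        for kmer, ratio in ratios.items():
            best[kmer] = max(ratio, best.get(kmer, float("-inf")))
    return sorted(best, key=best.get, reverse=True)[:top_n]
```

With such a ranking, each selected 5-mer would contribute three bars (all, proximal, distal), each with its own interval from `odds_ratio_with_ci`.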