[Preprint]. 2024 Feb 15:2023.04.02.535219. Originally published 2023 Apr 4. [Version 3] doi: 10.1101/2023.04.02.535219

Figure 6. Epigenetic comparisons of regulatory landscapes and cCREs.

(A and B) DNA sequence alignments and correlations of epigenetic states for the human GATA1 and mouse Gata1 genes and their flanking genes. (A) Dot-plot view of chained blastZ alignments by PipMaker (Schwartz et al. 2000) between genomic intervals encompassing and surrounding the human GATA1 (GRCh38 ChrX:48,760,001–48,836,000; 76 kb) and mouse Gata1 (mm10 ChrX:7,919,401–8,020,800; 101.4 kb, reverse complement of reference genome) genes. The axes are annotated with gene locations (GENCODE), predicted cis-regulatory elements (cCREs), and binding patterns for GATA1 and EP300 in erythroid cells. (B) Matrix of Pearson’s correlation values between epigenetic states (quantitative contributions of each epigenetic feature to the assigned state) across 15 cell types that are analogous between human and mouse. Each 200 bp bin in one species is correlated with every bin in the other species, and a red-blue heat map indicates the value of the correlation. Axes are annotated with genes and cCREs in each species.

(C) Decomposition of the correlation matrix (panel B) into six components, or factors, using nonnegative matrix factorization (NMF).

(D–G) Correlation matrices for genomic intervals encompassing GATA1/Gata1 and flanking genes, reconstructed using values from NMF factors. (D and E) Correlation matrices using values of NMF factor 3 between human and mouse (panel D) or within human and within mouse (panel E). The red rectangles highlight the positive regulatory patterns in the GATA1/Gata1 genes (labeled Px), which exhibit conservation of both DNA sequence and epigenetic state pattern. The orange rectangles denote the distal positive regulatory region present only in mouse (labeled D), which shows conservation of epigenetic state pattern without corresponding sequence conservation. Beneath the correlation matrices in panel E are maps of IDEAS epigenetic states across 15 cell types, followed by a graph of the score and peak calls for NMF factor 3 and annotation of cCREs (thin black rectangles) and genes. (F and G) Correlation matrices using values of NMF factor 6 between human and mouse (panel F) or within human and within mouse (panel G). The green rectangles highlight the correlation of epigenetic state patterns within the same gene, both across the two species and within each species individually, while the black rectangles highlight the high correlation observed between the two genes GATA1 and HDAC6.
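
The computations behind panels B through G can be illustrated with a minimal sketch. It is not the authors' pipeline, and several details are assumptions made only for illustration: each 200 bp bin is represented as a row vector of epigenetic feature contributions concatenated across cell types (here 15 cell types times 8 hypothetical features), the correlation matrix is rescaled from [-1, 1] to [0, 1] so that it is nonnegative before factorization, and NMF is run with plain multiplicative updates. The function names `cross_correlation` and `nmf` and the random input data are hypothetical.

```python
import numpy as np

def cross_correlation(human, mouse):
    """Pearson r between every row (bin) of `human` and every row of `mouse`.

    Both inputs have shape (n_bins, n_values), with the same n_values.
    Returns a matrix of shape (n_human_bins, n_mouse_bins).
    """
    h = human - human.mean(axis=1, keepdims=True)
    m = mouse - mouse.mean(axis=1, keepdims=True)
    h_norm = np.linalg.norm(h, axis=1, keepdims=True)
    m_norm = np.linalg.norm(m, axis=1, keepdims=True)
    # Constant rows have zero norm; dividing by 1 leaves their r at 0.
    h = h / np.where(h_norm == 0, 1.0, h_norm)
    m = m / np.where(m_norm == 0, 1.0, m_norm)
    return h @ m.T

def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Factor nonnegative V (m x n) as W (m x k) @ H (k x n).

    Uses multiplicative updates for the Frobenius objective
    (an assumed solver, chosen here for brevity).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in data: 6 human bins and 8 mouse bins,
# each described by 15 cell types x 8 features = 120 values.
rng = np.random.default_rng(0)
human = rng.random((6, 120))
mouse = rng.random((8, 120))
mouse[2] = human[4]  # identical state profile, so r is 1 for this pair

corr = cross_correlation(human, mouse)

# Assumption: shift correlations into [0, 1] so the input is nonnegative.
V = np.clip((corr + 1.0) / 2.0, 0.0, 1.0)
W, H = nmf(V, k=6)

# Matrix reconstructed from a single factor (index 2 is "factor 3").
factor3 = np.outer(W[:, 2], H[2])
```

Summing the single-factor reconstructions over all six factors gives back the full approximation `W @ H`, which is why an individual factor can be displayed as its own correlation-like matrix.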