Figure 2.
Grammatical patterns capture regulatory conservation better than positional conservation does. (A and B) Venn diagrams show the segmentation of regulatory space into 15 possible grammatical and positional classes. The first letter of each cell type was used to construct a class label for each segment in the diagrams. These labels describe the cell specificity of the corresponding grammatical patterns and positionally conserved loci. (A) Grammatical classes represent collections of grammatical patterns that share the same observed cell specificity. Each segment in the Venn diagram is labeled with its grammatical class, the number of grammatical patterns assigned to the class (first number), and the total number of CRMs contributing to those patterns (number in parentheses). Overall, the SOM partitions the dataset into 780 grammatical patterns, 593 of which are used in both species and 187 of which are species-specific (103 mouse and 84 human). (B) Positional classes describe regulatory conservation in terms of shared sequence occupancy, or positional conservation. Regulatory loci were assigned to positional classes based on the cell type(s) in which we observed TF occupancy, regardless of the specific TFs present. Each segment in the Venn diagram is labeled with its positional class, the total number of loci within the class (first number), and the total number of CRMs assigned to the class (number in parentheses). In all, 209 768 CRMs were observed at 128 380 distinct loci, 54 491 of which are positionally conserved. (C and D) Pie charts show the fraction of CRMs in each cell type assigned to seven aggregate grammatical classes (C) or positional classes (D). The regulatory landscape appears highly conserved when defined as a set of grammatical patterns, with the bulk of CRMs in all cell types falling into grammatical patterns shared between human and mouse (C). By contrast, the vast majority of regulatory loci are not positionally conserved across human and mouse (D); i.e., the orthologous locus in the other species is not bound by any TFs.