2018 May 14;7:e34077. doi: 10.7554/eLife.34077

Figure 2. Predicted intra-TAD loop anchors share many properties of TAD anchors.

(A) Diagram illustrating intra-TAD loop prediction based on CTCF motif orientation and CTCF and cohesin (Rad21) ChIP-seq binding strength data. Iteration was conducted until 20,000 loops were predicted per sample, prior to filtering and intersection across samples, as detailed in Materials and methods. (B) Shown is a 2 Mb segment of mouse chromosome 2 indicating TAD loops (blue) and intra-TAD loops (pink) in relation to genes. Also shown are cohesin interaction loops identified experimentally in mouse ESC by Smc1 ChIA-PET (Dowen et al., 2014). (C) TAD and predicted intra-TAD loop anchors are more tissue ubiquitous than other categories of CTCF/CAC sites. Each of the four CTCF site subgroups was defined in mouse liver as detailed in Supplementary file 1C. The x-axis indicates the number of ENCODE tissues, out of 15 tissues examined, that also have CTCF bound, where a higher value indicates more tissue-ubiquitous CTCF binding. These data are shown for ‘lone’ CTCF binding sites (10,553), non-anchor cohesin-and-CTCF sites (‘Other CAC’; 26,970), TAD anchors (5,861), and intra-TAD loop anchors (9,052, which excludes those at a TAD loop anchor). While ‘Other CAC’ sites tend to be weaker (Figure 2F, below), 93% are bound by CTCF in at least one other mouse tissue, and 66% were verified in at least six other tissues. Similarly, for ‘Lone CTCF’, 81% of sites were bound by CTCF in at least one other mouse tissue, and 39% were verified in at least six other tissues (not shown). (D) TAD and intra-TAD loop anchors are more resistant to the knockdown effects of Rad21+/− haploinsufficiency than other CAC sites or cohesin-bound regions. A larger fraction is also bound by the novel extrusion complex factor Top2b (Supplementary file 1C). (E) Loop anchors show greater intra-motif conservation than other CTCF-bound regions. Shown is the aggregate PhastCons score for oriented core motifs within either TAD (dark blue) or intra-TAD (light blue) anchors as compared to other CTCF peaks with motifs (yellow). (F) Cohesin interacts with the COOH terminus of CTCF (Xiao et al., 2011), which results in a shift of ~20 nt in cohesin ChIP signal relative to the CTCF summit (cf. shift to the right of vertical red line) regardless of category of CTCF binding site (anchor/non-anchor). Blue arrows indicate the CTCF motif orientation, and red triangles and vertical lines indicate the position of the CTCF signal summit.
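The idea in panel A (pair convergently oriented CTCF motifs, and iterate until a target number of loops is reached) can be illustrated with a toy sketch. This is not the authors' implementation, which is described in Materials and methods; the `Site` record, the single combined `strength` value, the nearest-downstream pairing rule, and the cutoff-relaxation loop are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Site(NamedTuple):
    pos: int         # toy genomic coordinate of a CTCF motif
    strand: str      # '+' or '-' motif orientation
    strength: float  # hypothetical combined CTCF/cohesin signal


def pair_convergent(sites: List[Site]) -> List[Tuple[int, int]]:
    """Pair each '+' site with the nearest downstream '-' site (assumed rule)."""
    ordered = sorted(sites, key=lambda s: s.pos)
    loops = []
    for i, left in enumerate(ordered):
        if left.strand != '+':
            continue
        for right in ordered[i + 1:]:
            if right.strand == '-':
                loops.append((left.pos, right.pos))
                break
    return loops


def predict_loops(sites: List[Site], target: int) -> List[Tuple[int, int]]:
    """Relax the strength cutoff, strongest first, until `target` loops are found."""
    loops: List[Tuple[int, int]] = []
    for cutoff in sorted({s.strength for s in sites}, reverse=True):
        kept = [s for s in sites if s.strength >= cutoff]
        loops = pair_convergent(kept)
        if len(loops) >= target:
            break
    return loops
```

In the legend the stopping point is 20,000 loops per sample; the sketch takes that number as the `target` argument and omits the subsequent filtering and intersection across samples.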

Figure 2—figure supplement 1. Comparison of CTCF Features within TADs and Loop Prediction Improvements.

(A) Fewer than 25% of CTCF, CAC, and tissue ubiquitous CTCF sites are within 25 kb of a TAD boundary. While these features are strongly enriched at TAD boundaries (Figure 1A), the vast majority of any subgroup of CTCF sites is still TAD-internal. (B) Comparison of features between the set of 9543 mouse liver intra-TAD loops predicted in this study and an alternative set of 60,677 mouse liver loops predicted using the method described previously (Oti et al., 2016), without any additional filtering. These ‘60 k loops’ tend to be shorter, show much less overlap with mESC ChIA-PET loops, and only capture 59% of intra-TAD loops, as shown. To determine properties of the anchors for the 60 k loop set, we considered a subset comprised of 25,983 unique alternative loop anchors (i.e. loop anchors that are not also anchors of intra-TAD loops or TADs; see Materials and methods). This ‘26 k loop anchor’ subset shows many fewer directional interactions and less insulation (median IBI and JSD, respectively; also see Figure 2—figure supplement 1C,D). (C) The set of 9052 intra-TAD loop anchors (Supplementary file 1B) shows greater insulation of repressive histone marks than the set of 25,983 other putative CTCF-mediated loops (‘26 k anchors’) using a prior iterative loop prediction method (Oti et al., 2016). CTCF peaks identified in the merged sample (combination of all biological replicates) were used as input for computational loop prediction exactly as described in (Oti et al., 2016), without consideration of cohesin strength/binding and without applying any additional filters. This loop list was then filtered to remove any loop that shares an anchor with an intra-TAD loop or whose anchor is within 50 kb of a TAD boundary. Shown here are the insulation scores (JSD) around intra-TAD loop anchors and other putative loop anchors (as defined in panel B) for H3K27me3 and H3K9me3 ChIP-seq read distribution, both of which show greater insulation around intra-TAD loop anchors. 
(D) Intra-TAD loop anchors show greater insulation of Hi-C data-based interactions as well as stronger directionality of interactions than other putative CTCF loops. The graphic for intra-TAD loop anchors is reproduced from Figure 3C. The 26 k loop anchors defined in panel B were split into left (upstream) and right (downstream) anchors based on CTCF motif orientation. See Figure 3C for further details. (E) 91% of predicted intra-TAD loops are wholly contained within a single TAD, even without filtering for TAD or TSS overlap. This compares to 67% for a randomly shuffled set of 9543 regions of equal length and number to the set of intra-TAD loops.
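The containment comparison in panel E can be sketched as follows. This is a minimal illustration, assuming loops and TADs are given as (start, end) pairs on a single chromosome and that TADs do not overlap one another; the function names and the uniform-placement shuffle are hypothetical choices, not the procedure used in the study.

```python
import random
from bisect import bisect_right


def fraction_within_one_tad(regions, tads):
    """Fraction of regions lying wholly inside a single TAD.

    Assumes non-overlapping TADs on one chromosome.
    """
    tads = sorted(tads)
    starts = [start for start, _ in tads]
    contained = 0
    for start, end in regions:
        # Index of the last TAD that starts at or before the region start.
        i = bisect_right(starts, start) - 1
        if i >= 0 and end <= tads[i][1]:
            contained += 1
    return contained / len(regions)


def shuffle_regions(regions, chrom_length, seed=0):
    """Place each region at a random position, keeping its length (assumed null model)."""
    rng = random.Random(seed)
    shuffled = []
    for start, end in regions:
        length = end - start
        new_start = rng.randint(0, chrom_length - length)
        shuffled.append((new_start, new_start + length))
    return shuffled
```

Calling `fraction_within_one_tad` once on the predicted loops and once on `shuffle_regions(...)` output gives the two quantities that the legend reports as 91% and 67%.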
Figure 2—figure supplement 2. Example Screenshots of Predicted intra-TAD Loops with Observable Interactions in CH12-LX (Mouse B-Cells).

(A-C) Many of the intra-TAD loop structures that we predicted for mouse liver can be seen in the high resolution Hi-C data from the mouse B-cell lymphoma cell line CH12-LX (12). TADs are marked in each panel as horizontal red lines. Shown beneath each red line are the liver CTCF and cohesin ChIP-seq data used to predict the liver intra-TAD loops indicated. Panel A shows two examples of single intra-TAD loops within TADs. Panel B shows examples of nested intra-TAD loops, where one intra-TAD loop anchor is predicted to interact with more than one CAC anchor. Finally, Panel C shows more complex subdivision of TADs into multiple intra-TAD loops. The top section of each panel shows the Hi-C data from CH12-LX cells, while the lower section of each panel presents our data from mouse liver. Red arrowheads mark focal peaks in the contact matrix, which correspond to the midpoints of the predicted intra-TAD loops.
Figure 2—figure supplement 3. Subclasses of CTCF binding events in relation to predicted loops.

(A) Summary of CTCF sites, CAC sites, TAD loop anchors, and intra-TAD loop anchors in mouse liver based on lists in Supplementary file 1C. (1) CTCF peaks found in at least two biological replicate samples (n = 52,436). (2) The subset of the above CTCF sites that also overlap a cohesin (Rad21) binding site (n = 42,801). (3) CTCF sites predicted to be involved in an intra-TAD loop (9,052) or a TAD loop (5,861). Due to some ambiguity and nesting of many loop structures, the number of intra-TAD loop anchors shown is substantially less than the total number of intra-TAD loops (9,543) multiplied by a factor of 2. (B) Cohesin appears to be the primary contributing factor for topoisomerase-IIβ (Top2b) interaction with CTCF, as Top2b is only present at 8.5% of CTCF sites lacking cohesin, but is found at 56% of cohesin sites lacking CTCF (i.e., CNC sites). (C) De novo motif discovery for loop anchors did not reveal any specific motifs that differentiate loop anchors from other CTCF-bound regions. De novo motif discovery was performed using Homer. In some cases, evidence of expanded CTCF motif usage was observed downstream of the core motif (region 3, loop interior); however, fewer than 5% of the genomic regions analyzed contained any of the specific motifs discovered. We found evidence for the M2 motif (region 2, loop exterior) in all groups, fitting with the small secondary peak of conservation just upstream of the core motif. The PhastCons intra-motif conservation figure duplicates that shown in Figure 2E. (D) Analysis of loop anchors for known motifs did not identify any specific motifs that differentiate loop anchors from other CTCF-bound regions that contain the core CTCF motif (MA0139.1). Some modest degree of sequence optimization may occur at loop anchors, as additional CTCF motifs were consistently present at a higher proportion of loop anchors than at other CTCF-bound regions. The motif for Znf143 and the specific ‘M2’ CTCF motif showed no differences and were found in ~20% of the genomic regions in each group. JASPAR IDs are indicated when available. Position weight matrices for ‘additional de novo CTCF motifs’ can be found at CTCFBSDB 2.0 (http://insulatordb.uthsc.edu/download/CTCFBSDB_PWM.mat).
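The anchor count in panel A is lower than twice the loop count (2 × 9,543 = 19,086 versus 9,052) because nested loops share anchors and anchors that coincide with TAD anchors are excluded (see Figure 2C). A small sketch of that bookkeeping, with hypothetical anchor identifiers and a hypothetical `unique_anchors` helper:

```python
def unique_anchors(loops, tad_anchors=frozenset()):
    """Count each anchor once, then drop anchors that are also TAD anchors."""
    anchors = set()
    for left, right in loops:
        anchors.update((left, right))
    return anchors - set(tad_anchors)


# Toy example: loops ("a", "b") and ("a", "c") are nested and share anchor "a".
example_loops = [("a", "b"), ("a", "c"), ("d", "e")]
example_anchors = unique_anchors(example_loops, tad_anchors={"e"})
```

Here three loops would give six anchors if counted naively, but only four remain after merging the shared anchor and removing the one at a TAD anchor.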
Figure 2—figure supplement 4. Intra-TAD loop prediction in two other mouse cell types: mESC and NPC.

(A) Comparison of CAC-mediated intra-TAD loops predicted for mouse embryonic stem cells (mESCs) and neural progenitor cells (NPCs) with those predicted in mouse liver. The number of loops present after merging predictions across all replicates is shown (column 1), followed by the number of loops that do not substantially overlap TADs (i.e., loops that show <80% reciprocal overlap with TADs; column 2). Column 3 shows the number of loops that contain ≥1 TSS (either protein-coding gene or multi-exonic long noncoding RNA TSS). Overlaps are presented for this set of filtered loops (column 3) with the sets of intra-TAD loops predicted in liver, Insulated Neighborhoods in mESC (Hnisz et al., 2016), and ‘CTCF-CTCF’ interactions from Smc1 ChIA-PET experiments in mESC (Dowen et al., 2014). As the size of each group is different, overlaps in columns 5 and 6 show the percent overlap relative to the group indicated in the column header (either liver intra-TAD loops or insulated neighborhoods) followed by parentheses showing the percent overlap relative to the row group (either liver, mESC, or NPC intra-TAD loops). (B) Shown is the overlap between mESC and NPC intra-TAD loops (63%), which is similar to the overlap between mESC and NPC TADs. This indicates that intra-TAD loops show a similar, or even somewhat higher, level of tissue ubiquity as do TADs. (C) Tissue-specific intra-TAD loops are weaker than those shared across liver, mESC, and NPCs. To compare the relative strength of loops predicted by our approach, we divided Smc1 ChIA-PET loops (Dowen et al., 2014) into five groups: ‘+++’, meaning ‘CTCF-CTCF’ interactions that overlap intra-TAD loops predicted in all three cell types; ‘++-’, meaning ‘CTCF-CTCF’ interactions that overlap intra-TAD loops predicted in mESCs and only one other cell type; ‘+--’, meaning ‘CTCF-CTCF’ interactions that overlap intra-TAD loops predicted in mESCs and no other cell type; ‘---’, meaning ‘CTCF-CTCF’ interactions that do not overlap any predicted intra-TAD loop in mESC; or ‘CNC’, meaning cohesin-non-CTCF interactions or cohesin-mediated interactions that are not anchored by mESC CTCF binding (primarily enhancer-promoter interactions). There is no significant difference in strength of interaction between mESC-unique intra-TAD loops (+--) and those not predicted in our model (---). Both of these groups are still stronger than CNC-mediated interactions (‘+--’ or ‘---’ vs CNC), as measured by the number of PETs supporting these interactions.
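The grouping of ‘CTCF-CTCF’ interactions in panel C can be sketched as a labelling function. The legend does not state the overlap criterion used for this step, so the 80% reciprocal overlap below is an assumption borrowed from the TAD filter in panel A; the function names are hypothetical, and the ‘CNC’ group (defined by absence of CTCF binding rather than by loop overlap) is not handled here.

```python
def reciprocal_overlap(a, b, min_frac=0.8):
    """True if intervals a and b each cover at least `min_frac` of the other."""
    shared = min(a[1], b[1]) - max(a[0], b[0])
    if shared <= 0:
        return False
    return shared / (a[1] - a[0]) >= min_frac and shared / (b[1] - b[0]) >= min_frac


def overlaps_any(interaction, loops):
    return any(reciprocal_overlap(interaction, loop) for loop in loops)


def label_interaction(interaction, mesc_loops, npc_loops, liver_loops):
    """Return '+++', '++-', '+--', or '---' for one ChIA-PET interaction."""
    if not overlaps_any(interaction, mesc_loops):
        return '---'
    # Count how many of the two other cell types also predict the loop.
    others = sum(overlaps_any(interaction, loops) for loops in (npc_loops, liver_loops))
    return '+' * (1 + others) + '-' * (2 - others)
```

Interaction strength can then be compared between labels, for example by the number of supporting PETs per group as in the legend.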
Figure 2—figure supplement 5. Example Screenshots for intra-TAD Loops in mESC and NPC (Mouse).

(A-C) Screenshots of intra-TAD loops predicted in mESC and NPC cells are shown below high-resolution Hi-C data for each cell type. These data provide support for both tissue-specific and tissue-ubiquitous intra-TAD loops. The same genomic region is shown on the left and on the right of each panel. (A) Intra-TAD loops shared between mESC and NPC cells on mouse chromosome 17. Shown are four shared intra-TAD loops, whose anchor-to-anchor interactions are apparent from the Hi-C data in both mESC and NPC cells (blue arrowheads). Three of the upstream loops are contained within a weaker TAD loop, which can be seen in both cell types (orange arrowhead). (B) Intra-TAD loop on chromosome 1 that is predicted only in NPCs. These data support the model that a minority of intra-TAD loops are tissue-specific, as this genomic region shows an intra-TAD loop that was predicted only in NPCs, with an interaction seen only in the NPC, but not the mESC, contact matrix (green arrowhead). (C) Intra-TAD loops on a segment of chromosome 1 that are predicted only in mESCs or only in NPCs. Tissue-specific intra-TAD loops for mESCs and NPCs are observed, each of which is supported by a corresponding enrichment in its respective Hi-C contact matrix (green arrowhead). Also shown is a tissue-specific loop in NPCs that may represent some other type of looping event (e.g., enhancer-promoter; purple arrowhead).
Figure 2—figure supplement 6. Intra-TAD loop predictions in human cell lines GM12878 and K562.

(A) Comparison of CAC-mediated intra-TAD loops predicted in human lymphoblastoid-derived cells (GM12878 cells) and in human chronic myelogenous leukemic cells (K562 cells). The number of loops present after merging predictions across all replicates is shown, followed by the number retained after filtering to keep only loops that contain a RefSeq gene TSS. The overlap of each group with its respective sets of loop domains (LD; column 4) and contact domains (CD; column 5) is shown; the percent of intra-TAD loops that overlap loop domains or contact domains is listed first, followed by the percent of loop domains and contact domains that overlap an intra-TAD loop (values in parentheses). The percentage of intra-TAD loops that show CTCF ChIA-PET interactions in K562 cells is shown in the last column. (B) The percent of loops shared between GM12878 and K562 cells is consistently higher for intra-TAD loops predicted by our method (right) than for loop domains (left) or contact domains (middle). It should be noted that the GM12878 cell dataset was sequenced more deeply and with more replicates than the datasets for K562 cells, which is likely why GM12878 cells show ~50% more loop domains and contact domains than K562 cells. Even comparing the percent overlap relative to the smaller subset of loop domains/contact domains in K562 cells, we observe greater overlap between cell types for the intra-TAD loops predicted by our method. (C) K562-specific loops are significantly weaker than intra-TAD loops predicted in both K562 and GM12878 cells. To compare the relative strength of loops predicted by our method, we divided CTCF ChIA-PET loops from K562 cells (ENCODE: ENCSR436IAJ) into three groups: ‘++’, meaning CTCF ChIA-PET interactions that overlap with intra-TAD loops predicted in both K562 and GM12878 cells; ‘+-’, meaning CTCF ChIA-PET interactions that overlap intra-TAD loops predicted in K562 cells but not in GM12878 cells; or ‘--’, meaning CTCF ChIA-PET interactions that do not overlap any intra-TAD loop in K562 cells. Similar to the results for mouse (Figure 2—figure supplement 4C), the shared intra-TAD loops (++) show significantly higher interaction strength than either the K562-specific or other CTCF loops. Further, the K562-specific loops are stronger than other CTCF loops that do not overlap intra-TAD loops.
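Panel A reports each overlap twice because the two sets differ in size: once as a percent of intra-TAD loops and once (in parentheses) as a percent of loop domains or contact domains. A minimal sketch of that two-way calculation, assuming a simple any-overlap criterion on (start, end) intervals, which may differ from the criterion used in the study; the function names are hypothetical.

```python
def overlaps(a, b):
    """True if half-open intervals a and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def percent_overlapping(query, reference):
    """Percent of `query` intervals that overlap any `reference` interval."""
    hits = sum(any(overlaps(q, r) for r in reference) for q in query)
    return 100.0 * hits / len(query)


def two_way_overlap(loops, domains):
    """Return (percent of loops overlapping a domain, percent of domains overlapping a loop)."""
    return percent_overlapping(loops, domains), percent_overlapping(domains, loops)
```

With four toy loops and a single domain spanning two of them, the first value is 50% while the second is 100%, which is why the legend lists both directions.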
Figure 2—figure supplement 7. Example Screenshots for intra-TAD Loops in K562 and GM12878 (Human).

(A-C) Screenshots of intra-TAD loops predicted in K562 and GM12878 cells are shown below high-resolution Hi-C data for each cell line. The same genomic region is shown on the left and on the right of each panel. Shown below each gene track are stacked H3K27ac ChIP-seq tracks for the tier 1 ENCODE cell lines, including K562 (dark blue) and GM12878 cells (red). (A) Intra-TAD loops shared between K562 and GM12878 cells on human chromosome 2. Shown are nested intra-TAD loops whose anchor-to-anchor interactions are apparent in both K562 and GM12878 cells (blue arrowheads). (B) Intra-TAD loop on human chromosome 1 that is predicted only in GM12878 cells. This supports the model that a minority of intra-TAD loops are tissue-specific, as this region shows an intra-TAD loop predicted only in GM12878 cells, with an interaction observed only in the GM12878, but not the K562, contact matrix (green arrowhead). Other GM12878-specific interactions are observable within the intra-TAD loops between the promoter of MIR181A1HG and upstream GM12878-specific enhancers (purple arrowhead; red H3K27ac track). (C) Intra-TAD loops on human chromosome 1 that are either predicted only in K562 cells or shared between K562 and GM12878 cells. Shown upstream is a K562-specific nesting (green arrowhead) within a larger shared intra-TAD loop (blue arrowhead). Shown downstream are additional examples of tissue-specific intra-TAD loops present in K562 cells, which are supported by corresponding enrichments in the K562 Hi-C contact matrices.