Abstract
In this study, by exploring chromatin conformation capture data, we show that DNA sequence composition contributes to the nuclear segregation of Topologically Associated Domains (TADs). GC-peaks and valleys of TADs strongly influence interchromosomal interactions and chromatin 3D structure. To gain insight into the compositional and functional constraints associated with chromatin interactions and TAD formation, we analysed intra-TAD and intra-loop GC variations. This led to the identification of clear GC-gradients, along which the density of genes, super-enhancers, transcriptional activity, and CTCF binding site occupancy co-vary non-randomly. Further, the analysis of DNA base composition of nucleolar aggregates and nuclear speckles showed strong sequence-dependent effects. We conjecture that dynamic DNA binding affinity and flexibility underlie the emergence of chromatin condensates; their growth is likely promoted in mechanically soft regions (GC-rich) of the lowest chromatin and nucleosome densities. As a practical perspective, the strong linear association between sequence composition and interchromosomal contacts can help define consensus chromatin interactions, which in turn may be used to study alternative states of chromatin architecture.
Subject terms: Genetics, Computational biology and bioinformatics
Introduction
Recently developed chromatin conformation capture techniques and methods uncovered principles of the spatial organization of nuclear hubs and interchromosomal interactions. The discovery, characterization, and function of chromatin domains have been covered by a number of reviews1–5. These methods revealed many features of 3D genome organization, in particular, topologically associated domains (TADs)6,7, self-interacting regions, characterized by frequent within-chromatin interactions compared to relatively lower-frequency interactions with surrounding regions. They represent genomic architectural modules that constrain enhancer-promoter contacts, thereby setting tissue-specific interactions that regulate gene expression within TADs and connecting chromatin architecture with local gene expression2,8. TADs may contain smaller “sub-TADs”9,10 and, at smaller scale, may harbour individual “loops”9 or “insulation neighbourhoods”8,11.
The initial definition of TADs implicitly included Lamina Associated Domains (LADs)12. LADs occupy ~40% of the chromatin space and are known to be AT-rich13. Recent work showed that mammalian interphase chromatin is a mosaic of different TADs (or inter-LADs) and LADs, broadly mapping to GC-rich and GC-poor chromosomal domains (isochores), with constitutive LADs (cLADs) being the GC-poorest14. These investigations led to the following observations: (1) the match between isochores and TADs; (2) the evidence of complex structure and high density of CTCF binding sites in GC-rich TADs, compared to the rather flat GC profiles of LADs; (3) the correspondence between GC valleys/peaks and chromatin loops (see Fig. 3 in14); (4) the qualitative assessment of preferential interchromosomal interactions among GC-rich TADs, where chromatin is open and negative super-coiling is more frequent15,16. Yet convincing quantitative evidence on the role of sequence composition in interchromosomal interactions is still scarce.
Within the cell nucleus, contacts between genomic regions associated with the nuclear lamina occur between domains that are intra-chromosomally close to each other17,18, with preferences for interactions among GC-poor regions for larger intra-chromosomal distance14. GC-rich, gene-rich regions show greater compositional heterogeneity and overall weaker intra-chromosomal interactions than loci in GC-poor, gene-poor regions. The intensity of such interactions however exhibits significant change from cell to cell, e.g., between growing and senescent cells19.
Lately, Quinodoz et al.20 developed a method called split-pool recognition of interactions by tag extension (SPRITE), which led to the discovery of two major hubs of interchromosomal interactions arranged around nuclear bodies: the nucleolus and the nuclear speckles. The authors concluded that inactive hub regions are much closer to the nucleolus and that the 3D distance of DNA regions to these hubs is based on their functional properties, including the density of active Pol II within interacting genomic regions. Moreover, a large fraction of genomic regions showed preferential contacts with either hub; chromosomal regions that frequently contact the nucleolar hub were under-represented relative to the nuclear speckle hub, and vice versa (anti-correlated). Here we demonstrate that this anti-correlation is strongly associated with the DNA sequence composition of the loci under consideration. We suggest that regional GC-peaks and valleys, together with the flat GC profile of LADs, contribute to the encoding of higher order interchromosomal hubs. To further explain the dependence of higher chromatin organization on sequence composition, we studied the compositional gradients within TADs and quantified the intra-TAD sequence composition and its co-variation with the density of genes, of CTCF binding sites and of super-enhancers (SE); the latter are able to drive higher levels of transcription than single/typical enhancers21,22. To better understand the role of transcriptional activity in nuclear molecular crowding, we estimated inter- and intra-TAD gene transcriptional profiles using 27 human tissues. Together, this analysis suggests that physicochemical and functional constraints affect chromatin loop formation and may induce phase separation through interconnections between loop clusters. In such a model, multivalent macromolecular interactions23 occur preferentially in GC-rich, nucleosome-free chromatin24–26.
Results
Interchromosomal interactions are sequence composition dependent
To study the effect of regional genomic GC level on interchromosomal interactions, including hub formation, we first used the data provided by Quinodoz et al. to show that there is a strong GC enrichment of interchromosomal hubs arranged around the nuclear speckles (Fig. 1a). These associations (Pearson r = 0.82, p-value < 2.2e−16) are also observed locally along non-contiguous regions of mouse chromosome 11 reported in Quinodoz et al. (Fig. 1b). These results suggest that the preferential spatial arrangement of either the nucleolus or nuclear speckle hubs can be recognized by GC level changes along chromosomes.
Using a different method to explore interchromosomal interactions, Kalhor et al.27 devised the tethered chromosome conformation capture (TCC) method and showed that specific clusters of functionally active loci are more likely to form interchromosomal contacts than inactive ones, and that most of these contacts result from encounters between loci that are accessible to each other and have higher RNA polymerase II binding. Importantly, and in agreement with the SPRITE results, our analysis of genome-wide TCC data shows that interchromosomal interaction probability (ICP) is highly correlated (Pearson r = 0.62, p-value < 2.2e−16) with the GC% of the interacting domains (Fig. 1c). This strong correlation is obvious when both ICP and GC% are visualized as profiles along chromosomes, as shown for human chromosome 7 (Fig. 1d), demonstrating clearly that interchromosomal contacts among GC-rich TADs are substantially more frequent. This indicates that fitting a linear regression to the observed data is a good model for estimating either the expected interaction intensity or the expected GC content of the interacting DNA segments. Last, one can also notice the high interaction of centromeric regions; increased ICP values may be due to the repetitive “satellite” DNA embedding centromeres and to the frequent inter-centromere clusters, thought to initiate the formation of nucleoli and nuclear radial position28.
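The linear association described above can be illustrated with a short calculation. The following is a minimal sketch in Python, not the code used in this study: it computes the Pearson correlation coefficient and an ordinary least-squares fit of ICP on GC%. The function name `pearson_and_fit` and the toy values are hypothetical and serve only to show the arithmetic.

```python
import math

def pearson_and_fit(gc_percent, icp):
    """Return (r, slope, intercept) for a least-squares fit of ICP on GC%."""
    n = len(gc_percent)
    if n != len(icp) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(gc_percent) / n
    mean_y = sum(icp) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in gc_percent)
    syy = sum((y - mean_y) ** 2 for y in icp)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc_percent, icp))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

# Hypothetical toy values for five domains (NOT data from the study).
toy_gc = [36.0, 39.5, 42.0, 47.5, 54.0]
toy_icp = [0.10, 0.14, 0.15, 0.22, 0.27]
r, slope, intercept = pearson_and_fit(toy_gc, toy_icp)

# Expected ICP of a hypothetical domain at 50% GC under the fitted line.
predicted_icp_at_50 = slope * 50.0 + intercept
```

Under such a fit, the same line can be inverted to estimate the expected GC% of a domain from its observed ICP, provided the slope is not close to zero.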
The presence of GC-peaks and GC-valleys (Fig. 1b,d) along chromosomes appears to be a characteristic property of interchromosomal interactions. This prompted us to have a closer look at individual patterns of GC change within TADs or loops, and to quantify the intra-TAD distributions of key functional elements associated with these variations, namely gene, CTCF binding site and super-enhancer densities across TADs.
Functional features of TADs and loops
GC-rich TADs exhibit a higher frequency of loops or sub-TADs (Fig. 2e). The increase in the GC level of TADs is also accompanied by increased gene density, CTCF binding, SE overlap frequencies and transcription level (Fig. 2a–d). Open chromatin sub-compartments10 A1 and A2 are enriched in H2/H3-TADs (Fig. 2f), whereas the more compact B1-B3 sub-compartments are biased towards L1/L2-TADs (mainly LADs); the B4 sub-compartment is a chromatin state that is specific to chr19, a chromosome composed mainly of GC-rich, gene-rich isochores and with almost no anchor to the lamina.
Intra-TADs compositional shape and chromatin loops heterogeneity
Having observed a very strong correlation between physical contacts and DNA sequence composition (GC%), we wanted to know how sequence composition varies within TADs. Based on the GC profiles of size-binned TADs and loops, and ignoring orientation (equivalence of 5′ or 3′ gradient patterns), we defined six possible classes of loops or TADs: A (increasing or decreasing GC%), B (bell shape or peak), C (valley shape), B− (half bell), C− (half valley) and D (uncorrelated) (for details see the Methods section). These were found to be non-uniformly distributed across the human genome (Fig. S2). B class density is highest in GC-rich TADs and represents 45% of all classes in H2-H3 TADs, whereas C classes are most frequent in GC-poor TADs and represent 32% of all classes in L1-TADs. B− and C− are quasi-B and quasi-C TADs (see also legend to Fig. S1); their genome-wide frequency distributions across the TAD GC range follow the same pattern as B and C TADs. The D class (flat or spiky GC profile, −0.4 < r < 0.4) is homogeneously (~25%) distributed (Fig. S2a). Notice that D class GC variation along L1 and L2-TADs is more homogeneous than that along H2-H3 TADs, as one would expect from the positive correlation between average GC% and its standard deviation29. To see whether TADs/loops form clusters based on their internal variation of GC%, we performed unsupervised Principal Component Analysis (PCA) on the raw matrices, with individual TADs/loops as rows and binned GC values along the normalized TAD/loop length as columns (see Methods for more details). As expected, PCA results showed that F1 values are highly correlated (r = 0.99) with the average GC% of TADs/loops. We thus used the F2 and F3 principal components and could clearly identify separate clusters: B (bell shape) and B− (half bell shape) classes on the one hand, and C (valley shape) and C− (half valley shape) classes on the other (Fig. S2b).
This indicates that the variance of GC within loop domains (intra-loop) indeed explains the positioning of individual loops in the (F2, F3) PCA plane (see Table S1 for genome-wide proportions of all TAD and loop classes). In view of the nested structure of TADs, one of the sources of uncertainty associated with TAD boundaries30, we re-estimated the relative contribution of each TAD/loop class after either increasing or decreasing their size by 50 kb on both 3′ and 5′ ends, and recalculated their genome-wide frequencies (Table S2). Interestingly, a majority of loops remained in the C, B− and C− classes, which are in other words the least sensitive to boundary definition, whereas a fraction of B-loops moved to the B− class. This again indicates that classes B− and C− behave like classes B and C, respectively; together they represent ~70% of the annotated TADs (see Table S1). In what follows, we will focus on the B and C classes, due to their sharp separation in the PCA and their shared GC-gradients with B− and C−.
Intra-TADs functional features
B and C-TADs (Fig. 3) exhibit different distribution patterns of functional elements; genes, CTCF binding and SE overlap are high at the GC-rich borders of the C-TADs. B-TADs show a less expected pattern: the gene density is highest at the borders despite their relative GC-poorness compared to the TAD centres, and CTCF binding density is diffuse instead of peaking at the centre of the B-TADs, where GC% is the highest.
Expression data across 27 tissues showed that GC-rich TADs are enriched for highly expressed and housekeeping genes (Fig. S3), making them poised to contribute more to active hub regions, such as nuclear speckles (Fig. 1a,b). B− and C− show density patterns similar to those of B and C TADs, and A-TADs and D-TADs behave similarly by exhibiting gene-dense borders (Fig. S4). Enhancer frequency almost invariably follows the GC-gradient of TADs, in agreement with the overall (inter-TAD) trend observed in Fig. 2. Notably, C TADs exhibit a lower log10 [mean TPM] value compared to B TADs.
So far, the GC level of TADs was shown to correlate positively with other functional features, such as gene expression, gene density, SE overlap frequency and CTCF site occupancy. CTCF and the SMC cohesin complex are associated with insulator function and are found at TAD boundaries6,7. Such an organization is most evident for C TADs, along which the frequencies of CTCF binding, of genes and of SEs peak at the borders of the chromatin domains. The characteristic shift of the high density of CTCF binding sites in B-TADs (Fig. 2) fits with their GC-rich centre (CTCF binding sites are themselves GC-rich) and may in turn explain the propensity of B-TADs to harbour multiple loops, either nested or neighbouring each other, as suggested by the higher loop density in GC-rich TADs (Fig. 2e). Incidentally, when we analysed mouse liver cells for which TAD and sub-TAD data were available31, B-TADs showed a significant enrichment for these substructures compared to C-TADs (3.66-fold enrichment, t-test p-value = 0.001); a similar overrepresentation of sub-TADs (3.59-fold, t-test p-value = 0.001) is also observed for B− TADs. The sub-TAD structure therefore appears to be favoured, in part, by the local high density of CTCF binding sites within B-TADs.
B-TADs harbour relatively more housekeeping genes than C-TADs (Wilcoxon rank test, p-value = 0.036); the same trend can be observed for class B− compared to class C−, although with a Wilcoxon rank test p-value of 0.097 (Fig. S5). Interestingly, both housekeeping and tissue-specific gene densities are highest at TAD borders, but the density of tissue-specific genes can also be high in the middle of class B-TADs (Fig. 4). This result is in partial agreement with the observation that boundary regions are enriched for housekeeping genes6.
Discussion
Constrained interchromosomal interactions
Up to this point, we could show (1) a strong compositional anti-correlation between active and inactive hub regions, and the increased bias of contact probability index values towards GC-rich TADs or isochores (Fig. 1); (2) the existence of TAD/loop classes, supported by supervised and unsupervised analysis of their compositional profiles; (3) that the density distributions of these classes across the genome (Fig. S2) are non-uniform. The C/C− class is biased towards GC-poor TADs; the latter will consequently tend to form nucleolar hubs, or interact at the nuclear envelope, to which AT-rich regions are regularly tethered. It is not clear whether these silenced chromatin clusters are actively self-maintained or whether the cell expression program primarily sets favourable nucleation around active hubs; in line with the second possibility, Pol II transcripts derived from intronic Alu elements (which are transcribed in the GC-rich nuclear interior) accumulate in nucleoli and were reported to be important for nucleolar integrity32. The inactive compartment of cLADs (GC-poorest TADs) is maintained by preferential sequestration in the nuclear envelope neighbourhood, while GC-rich TADs, transcriptionally active and mechanically flexible/softer33, may need active self-maintenance. Transcriptionally active TADs correspond to A1 + A2 open chromatin sub-compartments (Fig. 2f) and are generally located in the nuclear interior34,35, consistent with a less compact organization and an enrichment of long-range chromosomal contacts with other active TADs and potentially multi-TAD hubs36,37.
From a physicochemical point of view, interchromosomal interactions may recruit TADs and sub-TADs or loops with different compositional patterns (e.g. peak and valley) and consequently with different propensities to form nucleosomes and distinct abilities to bend and curve. In fact, GC content and dinucleotide frequencies may impose DNA structural/conformational constraints38,39; AT-rich tracts, AA/TT dinucleotide and AAA/TTT trinucleotide frequencies can raise the stiffness of the DNA fibre40, and GC tracts as well as the frequency of AAAA tetranucleotides can explain more than 50% of the variation in nucleosome occupancy41,42. Next to these DNA sequence factors, histone marks are surrogates for transcriptional activity that can impact local chromatin structure.
More relevant to the large-scale GC variations, electro-kinetic DNA stretching43 showed that the persistence length of long DNA (>100 kb), a quantifier of the stiffness of polymers, has a remarkable dependence on the underlying sequence; rigid and unbent structures are AT-rich as opposed to GC-rich ones43. These differences and their possible consequences on genome folding are pictured in Fig. 5.
In the case of C-TADs endowed with CTCF, the mild flexibility of the crest and the within-C-TAD compositional design can be accommodated either by a single cohesin ring sliding over a preformed loop through a reeling/extrusion mechanism44–48, or by handcuffed cohesin rings which, according to the “handcuff model”49, could entrap single chromatin fibres (~10 nm diameter) connected at the loop base by a mediator. The high ATP cost associated with loop formation by extrusion50–52 is to be contrasted with recent observations that cohesin translocation occurs via diffusion, which does not require ATP52,53; hence this energetic burden, if any, is expected to be strongly reduced in CTCF-less loops.
The results presented here led to a model for the formation of LADs and TADs. LADs tend to be in a relatively unconstrained chromatin configuration, with an elongated shape and reduced flexibility compared to GC-rich, gene-rich TADs54,55. The increase in AT-rich oligomers may reduce the local DNA flexibility and bending of LADs; such sequence motifs may serve as points of nucleation for lamina and membrane proteins in less active chromatin domains, which generally harbour tissue-specific genes (Fig. S5). Depending on whether a TAD or loop belongs to the B or C class, the initiation step of its formation may follow from the local stiffness/bending of the DNA fibre and from nucleosome density, as proposed earlier for meiotic loops55,56. In line with this rationale, it was proposed57 that within TADs, nucleosome spacing and DNA flexibility are higher in the middle than at the boundary of TADs. According to these authors, “attractive” forces within the chromatin domains can then confer specific local interactions, yielding a joint “insulation-attraction”. However, this model did not consider sequence composition as a possible underlying cause14. In this regard, an explanation was put forward58, according to which TAD bending results from increasing GC% (and its correlates: oligo-Gs, CpGs and CpG islands, nucleosome spacing) that peaks at the centre of the loop (bell shape); this “moulding step” is followed by an “extruding step” that ends at the CTCF binding sites located at the base of the loops. This explanation is possible for a fraction (~10%, Table S1) of B-loops, namely those with GC% peaking at the centre and CTCF insulation at the base, but it does not apply to other TAD or loop classes. For instance, C class loops are five times more frequent than B class loops; their GC gradient peaks at the borders, favouring CTCF binding and higher gene density (Fig. 3).
In fact, independently of their GC gradient profiles, TADs/loops are able to self-interact and may form through the process of loop extrusion in the case of CTCF-cohesin loops, but CTCF-less loops need other mechanisms to account for their formation. Indeed, up to 62% of all identified CTCF-cohesin complexes are not associated with the anchor regions of Hi-C loops, and 32% of TADs can form without an accompanying complex59. CTCF depletion60–63 reduced intra-TAD interactions and increased inter-TAD interactions, suggesting a weakening, but not a vanishing, of TAD boundaries. In the absence of the CTCF insulating factor, the presence of cohesin, mediator, or general transcription factors may suffice for moderate insulation of chromatin folds. Interestingly, CTCF-less loops consistently showed lower insulation of chromatin contacts31 and more cohesin and TOP2B binding sites; TOP2B may facilitate supercoiling in a transcription-dependent manner64,65. This is in agreement with the observation that TOP2-mediated DNA fragility is linked to transcription and proximity to loop anchors66.
The expected lower bendability of C-TADs may underlie the fact that they are less dense in loops or sub-TADs compared to B-TADs. As expected, L1-TADs are more GC-homogeneous than H2 + H3 TADs (Fig. S6), making the latter more subject to local (within TAD/loop) bending, variable nucleosome density24 and supercoiling15.
A compositional phase separation model of chromatin hubs
It is known that sub-cellular liquid-like compartments are selectively permeable to macromolecules and can regulate biochemical reactions by concentrating enzymes and substrates67. As far as transcriptional activity is concerned, Shin et al. 2018 proposed that growing nuclear condensates/hubs tend to physically exclude chromatin, leading to droplet formation. Along this line, we argue that GC-rich loops will favour transcriptional condensates (Fig. 5, bottom panel), possibly through nanoscale transcriptional assemblies at enhancer-rich and multi-gene clusters68,69.
The anticorrelated compositional profiles described for active and inactive hubs (Fig. 1) and the intra-TAD variations in enhancer density and gene transcription are reminiscent of the existence of meta-stable chromatin interactions that involve cooperative interaction between enhancer components and DNA base composition. According to this hypothesis, active molecular assemblies over the nuclear space are biased towards GC-rich TADs and loops. GC-poor TADs, which include constitutive and facultative LADs, are likely to communicate in the vicinity of the nuclear envelope or to cluster in nucleolar bodies (Fig. 1). Indeed, LADs display a substantial overlap with nucleolus-associated chromatin domains70. These observations appear to fit a “compositional phase-separation” model where multivalency, i.e. the availability of many different binding sites on a polymer71, is crucial. In the case of double stranded DNA (dsDNA), base stacking interactions72,73, key elements of DNA structure, are sequence dependent and determine the DNA flexibility and its phase behaviour74. This phenomenon may be related to the sequence-dependent persistence length and bendability of GC-tracts43, which, at a critical threshold, may lead to secondary phase separation, giving rise to liquid-crystalline dsDNA sub-compartments within droplets75.
Accordingly, small droplets can nucleate in both low (GC-rich) and high (GC-poor) chromatin density regions, and their growth will be enhanced in mechanically soft regions (GC-rich) of the lowest chromatin and nucleosome densities33,76, hence pulling distal GC-rich regions of the genome into confined nuclear space while excluding background chromatin33,77. The present model does not exclude mechanisms for local hub formation other than phase separation; the compositional heterogeneity of mammalian genomes and the associated differential nucleosome density may suffice to trigger molecular crowding, in particular within GC-rich chromatin domains, which likely recruit denser transcription factories78 due to their high gene density and transcriptional activity.
Finally, considering the compositional profiles of TADs, B and C classes do not only differ in their GC-gradients; B-TAD centres exhibit high overlap with SEs, while C-TADs exhibit high SE overlap at the borders (Fig. 3). If SE concentration and gene expression clusters contribute to the valency of interacting chromatin segments, increasing the number of SEs in GC-rich TADs (Fig. 5, bottom panel) will promote the formation of increasingly larger complexes that will emerge as phase-separated macromolecular entities such as speckles. Of note, intrinsically disordered domains from Mediator, Brd4, Oct4 or other TFs are expected to contribute to this process68,69,77.
Conclusions
In summary, our results indicate that sequence composition is a key aspect of TAD and hub formation. Other large-scale correlates, such as gene density and protein-DNA binding affinities, also contribute to spatial organization and local concentrations around nuclear bodies. The initiation step for TAD or loop formation is under “compositional constraints”, essentially driven by local flexibility or stiffness of the coiled DNA fibre. In such a context, intrinsic properties of the DNA sequence, bendability, and binding affinity of promoters and enhancers may have a strong influence on TAD dynamics and the phase separation behaviour of chromatin. The formation of active chromatin assemblies is compositionally biased and may take place in both GC-rich and GC-poor chromosomal environments, but gains strength in mechanically soft regions (GC-rich), where DNA-protein foci coalesce via multivalent links. Interactions among and within chromatin domains can be viewed as part of a flexible “chromatin code”79 that can help in deciphering to what extent the non-coding space of contemporary genomes is “junk”80,81 or “polite”82.
Methods
Data sets
To study TADs and chromatin loops in human, coordinates from genome-wide chromatin interaction frequencies (Hi-C experiments) performed on the human cell lines HMEC, HUVEC, IMR90, K562 and NHEK were taken from Rao et al.10. Human and mouse genomic coordinates of Topologically Associated Domains (TADs) were taken from Dixon et al.6 and Pope et al.83, using comparative modENCODE/ENCODE (Encyclopedia of DNA Elements) data. Human and mouse isochore boundaries were adopted from Costantini et al.84. GC% variation was visualized using a colour map representing increasing GC% in the order of the L1, L2, H1, H2 and H3 isochore families: deep blue (33–37% GC), light blue (37–41% GC), yellow (41–46% GC), orange (46–53% GC) and red (53–59% GC). These boundaries were applied to define L1-TADs, L2-TADs, H1-TADs, H2-TADs and H3-TADs. The human cell line data were converted to hg19 coordinates using UCSC liftOver when necessary.
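The assignment of a TAD to an isochore family from its average GC% can be sketched as follows. This is a minimal illustration in Python, not the code used in the study; the function names `isochore_family` and `tad_label` are hypothetical. Two choices are assumptions made here and are not stated in the text: intervals are treated as half-open (a value exactly on a boundary goes to the GC-richer family), and values below 33% or above 59% GC are assigned to L1 and H3, respectively.

```python
# Exclusive upper GC% bounds of the first four families, as quoted in the text.
ISOCHORE_UPPER_BOUNDS = [(37.0, "L1"), (41.0, "L2"), (46.0, "H1"), (53.0, "H2")]

def isochore_family(gc_percent):
    """Map an average GC% to an isochore family name (L1, L2, H1, H2 or H3)."""
    if not 0.0 <= gc_percent <= 100.0:
        raise ValueError("GC% must lie between 0 and 100")
    for upper_bound, family in ISOCHORE_UPPER_BOUNDS:
        # Assumption: half-open intervals, so a boundary value is not included here.
        if gc_percent < upper_bound:
            return family
    # Assumption: everything at or above 53% GC is treated as H3.
    return "H3"

def tad_label(gc_percent):
    """Return the TAD label used in the text, e.g. 'H2-TAD'."""
    return isochore_family(gc_percent) + "-TAD"
```

For example, under these assumptions a TAD with an average GC content of 50% would be labelled an H2-TAD.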
Interchromosomal interaction data were from Quinodoz et al.20; the authors assigned genomic DNA to inactive/nucleolar or active/speckle hubs in mouse ES cells and human GM12878 cells at 1 Mb resolution. We also used a different set of interacting genomic intervals obtained by tethered chromosome conformation capture27, another method allowing the exploration of interchromosomal interactions. These authors calculated the Interchromosomal Contact Probability index (ICP), which is defined for a given region as the sum of its interchromosomal contact frequencies divided by the sum of its inter- and intra-chromosomal contact frequencies. Therefore, ICP describes the propensity of a region to form interchromosomal contacts. These data are from GM12878 human lymphoblastoid cells.
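The ICP definition given above translates directly into a ratio. The snippet below is a minimal sketch in Python; the function name and the handling of regions without any contact (returning `None`) are assumptions made for illustration, not details taken from Kalhor et al.

```python
def interchromosomal_contact_probability(inter_contacts, intra_contacts):
    """ICP = inter / (inter + intra) for one genomic region.

    inter_contacts: summed interchromosomal contact frequency of the region
    intra_contacts: summed intrachromosomal contact frequency of the region
    """
    if inter_contacts < 0 or intra_contacts < 0:
        raise ValueError("contact frequencies cannot be negative")
    total = inter_contacts + intra_contacts
    if total == 0:
        # Assumption: a region with no recorded contact has an undefined ICP.
        return None
    return inter_contacts / total
```

A region with 30 interchromosomal and 70 intrachromosomal contacts would thus have an ICP of 0.3; values close to 1 indicate regions whose contacts are mostly interchromosomal.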
Clustering and identification of classes
To estimate the intra-loop patterns of GC variation, we divided the loops into two equal halves to quantify the GC% increment/decrement (Fig. S1). The first half includes bins 1 to 50 and the second half includes bins 51 to 100. The GC gradient of each half was then identified from the slope of the correlation (coefficient r) between the bin GC% and the relative distance in each half of the TAD or loop. For each cell type in this study, a GC matrix of dimension N × 100 was thus obtained, where the N rows correspond to the TADs identified in that cell type and the 100 columns to the TAD/loop bins. Positive, negative or close to 0 values of the slope reflect increasing, decreasing or uncorrelated GC% vs. TAD/loop normalized coordinates, respectively. Ignoring orientation (equivalence of 5′ or 3′ gradient patterns), we defined six possible classes of loops or TADs: A (increasing or decreasing GC%), B (bell shape or peak), C (valley shape), B− (half bell), C− (half valley) and D (uncorrelated). Because GC-poor (L1, L2) and GC-rich (H1, H2, H3) isochore families generally define TADs14, the two properties (base composition and folding) are combined: GC-poor TADs (L1-TADs and L2-TADs) and GC-rich TADs (H1-TADs, H2-TADs, H3-TADs). The B (bell shape) and C (valley shape) naming refers to compositional gradients within TADs.
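The class definitions above can be illustrated with a short sketch that takes one row of the N × 100 GC matrix and returns a class label. This is an assumed decision rule written in Python, not the procedure used in the study: the only threshold taken from the text is |r| < 0.4 for an uncorrelated profile, and the way B− and C− are derived from one sloped half and one flat half is a guess made here for illustration. The function names are hypothetical, and the labels "B-" and "C-" use an ASCII hyphen.

```python
import math

R_THRESHOLD = 0.4  # |r| below this counts as "uncorrelated" (D-class cut-off in the text)

def pearson_r(values):
    """Pearson r between bin index (0, 1, 2, ...) and values; 0.0 when undefined."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    syy = sum((v - mean_y) ** 2 for v in values)
    if syy == 0:
        return 0.0  # a perfectly flat half has no trend
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    return sxy / math.sqrt(sxx * syy)

def half_trend(gc_half):
    """Return +1 (GC% increases), -1 (GC% decreases) or 0 (no clear trend)."""
    r = pearson_r(gc_half)
    if r >= R_THRESHOLD:
        return 1
    if r <= -R_THRESHOLD:
        return -1
    return 0

def classify_profile(gc_bins):
    """Assign a binned GC profile to class A, B, C, B-, C- or D (assumed rule)."""
    mid = len(gc_bins) // 2
    first, second = half_trend(gc_bins[:mid]), half_trend(gc_bins[mid:])
    if first == 0 and second == 0:
        return "D"   # flat or spiky in both halves
    if first == second:
        return "A"   # monotonic increase or decrease
    if first == 1 and second == -1:
        return "B"   # peak at the centre (bell)
    if first == -1 and second == 1:
        return "C"   # trough at the centre (valley)
    # One half is flat. Assumption: a half that is higher near the centre
    # gives a half bell, otherwise a half valley.
    higher_at_centre = first == 1 or second == -1
    return "B-" if higher_at_centre else "C-"
```

With this rule, a profile that rises over bins 1 to 50 and falls over bins 51 to 100 is labelled B, and the mirror-image profile is labelled C; orientation is ignored because increasing and decreasing monotonic profiles both map to A.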
Performing an unsupervised classification on the binned GC% variations across TADs/loops allowed us to verify whether the classes defined above can be grouped in separate clusters. For this, we applied Principal Component Analysis (PCA) to the GC matrix of each cell type in the study, using the R package FactoMineR85,86. The PCA clusters were identified using the R package factoextra86. The factors (F1, F2 and F3) explaining the majority of the variance were used for the visualization of TAD/loop clusters.
Distribution of functional elements across TADs
To study functional aspects of TADs with respect to intra-TAD GC variation, genomic coordinates of protein coding genes were obtained from GENCODE87, and their expression levels in 27 tissues were collected from the GTEx portal. Transcriptional activity is expressed as Transcripts Per Million (TPM), a normalization method for RNA-seq that reads as “for every 1,000,000 RNA molecules in the RNA-seq sample, n came from this gene/transcript”. Genomic coordinates of human super-enhancers were obtained from dbSUPER88, the database of super-enhancers in mouse and human, and those of CTCF binding sites from CTCFBSDB 2.089.
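For readers unfamiliar with the TPM unit, its standard definition can be written out in a few lines. The sketch below is in Python and is only an illustration of the normalization; expression values in this study were collected from the GTEx portal rather than recomputed, and the function name `tpm` is hypothetical.

```python
def tpm(read_counts, lengths_kb):
    """Transcripts Per Million from raw read counts and transcript lengths (kb).

    Each count is first divided by transcript length (reads per kilobase),
    then the rates are rescaled so that they sum to one million.
    """
    if len(read_counts) != len(lengths_kb):
        raise ValueError("one length is required per transcript")
    if any(length <= 0 for length in lengths_kb):
        raise ValueError("transcript lengths must be positive")
    rates = [count / length for count, length in zip(read_counts, lengths_kb)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)  # no reads at all in the sample
    return [1_000_000 * rate / total for rate in rates]
```

Because the values of one sample always sum to one million, TPM values are comparable across samples as relative proportions of the transcript pool.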
We next quantified the overlap between the genomic coordinates of genes and loop boundaries using bedtools90. Only overlaps in which the gene coordinates did not extend beyond the borders of the TADs were considered. An index scale from 0.0 to 1.0 was used to assign relative positions of genes with respect to the TAD unit length; values at the extreme ends of this scale, i.e. 0.0–0.2 and 0.8–1.0, mean that the gene is located close to the borders of the TAD. Values in the middle of this scale, i.e. 0.3–0.7, mean that the gene is located around the centre of the TAD. The same approach was followed to analyse the distribution of super-enhancers and CTCF binding sites across TADs.
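The relative position index can be sketched as follows. This is a minimal Python illustration, not the bedtools-based procedure used in the study. Using the midpoint of the feature as its position is an assumption made here, as is the label "intermediate" for index values that fall between the border and centre ranges quoted above; both function names are hypothetical.

```python
def relative_position(feature_start, feature_end, tad_start, tad_end):
    """Position of a feature on a 0.0-1.0 scale along its TAD.

    Returns None when the feature extends beyond the TAD borders,
    since such overlaps are not considered.
    """
    if tad_end <= tad_start:
        raise ValueError("TAD end must be greater than TAD start")
    if feature_start < tad_start or feature_end > tad_end:
        return None
    # Assumption: the feature is represented by its midpoint.
    midpoint = (feature_start + feature_end) / 2
    return (midpoint - tad_start) / (tad_end - tad_start)

def position_label(index):
    """Translate an index into the border/centre ranges quoted in the text."""
    if index <= 0.2 or index >= 0.8:
        return "border"
    if 0.3 <= index <= 0.7:
        return "centre"
    return "intermediate"  # assumption: ranges not named in the text
```

The same two functions apply unchanged to super-enhancers and CTCF binding sites, since only start and end coordinates are needed.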
Distributions of housekeeping and tissue-specific genes within TAD classes were identified using the tissue specificity index (Tau)91. Genes with Tau value less than 0.3 were considered housekeeping genes, those with Tau value greater than 0.8 were considered tissue-specific.
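For reference, the commonly used form of the Tau index is the sum over tissues of (1 − x_i/x_max) divided by (n − 1), where x_i is the expression of the gene in tissue i and n is the number of tissues. The Python sketch below applies this formula together with the thresholds stated above; whether expression values were log-transformed beforehand is not specified in the text, so the sketch takes the values as given. Function names and the "intermediate" label are hypothetical.

```python
def tau_index(expression):
    """Tissue specificity index Tau for one gene across n tissues.

    0 means equal expression in every tissue, 1 means expression in one tissue only.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("Tau needs at least two tissues")
    if any(value < 0 for value in expression):
        raise ValueError("expression values cannot be negative")
    x_max = max(expression)
    if x_max == 0:
        return None  # gene not expressed anywhere: Tau is undefined
    return sum(1 - value / x_max for value in expression) / (n - 1)

def gene_category(tau):
    """Apply the thresholds stated in the text (Tau < 0.3 and Tau > 0.8)."""
    if tau < 0.3:
        return "housekeeping"
    if tau > 0.8:
        return "tissue-specific"
    return "intermediate"  # assumption: label for genes between the two cut-offs
```

A gene expressed at the same level in all 27 tissues would have Tau = 0 and be counted as housekeeping, whereas a gene detected in a single tissue would have Tau = 1.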
Acknowledgements
We thank Sofia A. Quinodoz for sharing data on nucleolar interactions.
Author Contributions
K.J. and T.W. planned the work, K.J. and M.C. performed the analysis and prepared the figures, K.J. wrote the manuscript and all authors reviewed the manuscript.
Competing Interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary information accompanies this paper at 10.1038/s41598-019-51036-9.
References
- 1. Dekker J, Heard E. Structural and functional diversity of Topologically Associating Domains. FEBS Lett. 2015;589:2877–2884. doi: 10.1016/j.febslet.2015.08.044.
- 2. Sexton T, Cavalli G. The role of chromosome domains in shaping the functional genome. Cell. 2015;160:1049–1059. doi: 10.1016/j.cell.2015.02.040.
- 3. Yu M, Ren B. The Three-Dimensional Organization of Mammalian Genomes. Annu Rev Cell Dev Biol. 2017;33:265–289. doi: 10.1146/annurev-cellbio-100616-060531.
- 4. Rowley MJ, Corces VG. Organizational principles of 3D genome architecture. Nat Rev Genet. 2018;19:789–800. doi: 10.1038/s41576-018-0060-8.
- 5. van Steensel B, Furlong EEM. The role of transcription in shaping the spatial organization of the genome. Nat Rev Mol Cell Biol. 2019;20:327–337. doi: 10.1038/s41580-019-0114-6.
- 6. Dixon JR, et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485:376–380. doi: 10.1038/nature11082.
- 7. Nora EP, et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature. 2012;485:381–385. doi: 10.1038/nature11049.
- 8. Dowen JM, et al. Control of cell identity genes occurs in insulated neighborhoods in mammalian chromosomes. Cell. 2014;159:374–387. doi: 10.1016/j.cell.2014.09.030.
- 9. Phillips-Cremins JE, et al. Architectural protein subclasses shape 3D organization of genomes during lineage commitment. Cell. 2013;153:1281–1295. doi: 10.1016/j.cell.2013.04.053.
- 10. Rao SSP, et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014;159:1665–1680. doi: 10.1016/j.cell.2014.11.021.
- 11.Ji X, et al. 3D Chromosome Regulatory Landscape of Human Pluripotent Cells. Cell. 2016;18:262–275. doi: 10.1016/j.stem.2015.11.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Guelen L, et al. Constitutive nuclear lamina-genome interactions are highly conserved and associated with A/T-rich sequence. Genome Res. 2013;2:270–80. doi: 10.1101/gr.141028.112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Jabbari K, Bernardi G. An isochore framework underlies chromatin architecture. PLoS One. 2017;12:e0168023. doi: 10.1371/journal.pone.0168023. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14. Naughton C, et al. Transcription forms and remodels supercoiling domains unfolding large-scale chromatin structures. Nat Struct Mol Biol. 2013;20:387–95. doi: 10.1038/nsmb.2509.
- 15. Corless S, Gilbert N. Effects of DNA supercoiling on chromatin architecture. Biophys Rev. 2016;8:245–258. doi: 10.1007/s12551-016-0210-1.
- 16. Peric-Hupkes D, et al. Mol Cell. 2010;38:603–13. doi: 10.1016/j.molcel.2010.03.016.
- 17. van Steensel B, Belmont AS. Lamina-Associated Domains: Links with Chromosome Architecture, Heterochromatin, and Gene Repression. Cell. 2017;169:780–791. doi: 10.1016/j.cell.2017.04.022.
- 18. Chandra T, et al. Global reorganization of the nuclear landscape in senescent cells. Cell Rep. 2015;10:471–83. doi: 10.1016/j.celrep.2014.12.055.
- 19. Quinodoz SA, et al. Higher-Order Interchromosomal Hubs Shape 3D Genome Organization in the Nucleus. Cell. 2018;174:744–757. doi: 10.1016/j.cell.2018.05.024.
- 20. Long HK, Prescott SL, Wysocka J. Ever-Changing Landscapes: Transcriptional Enhancers in Development and Evolution. Cell. 2016;167:1170–1187. doi: 10.1016/j.cell.2016.09.018.
- 21. Galupa R, Heard E. Topologically Associating Domains in Chromosome Architecture and Gene Regulatory Landscapes during Development, Disease, and Evolution. Cold Spring Harb Symp Quant Biol. 2017;82:267–278. doi: 10.1101/sqb.2017.82.035030.
- 22. Banani SF, Lee HO, Hyman AA, Rosen MK. Biomolecular condensates: organizers of cellular biochemistry. Nat Rev Mol Cell Biol. 2017;18:285–298. doi: 10.1038/nrm.2017.7.
- 23. Fenouil R, et al. CpG islands and GC content dictate nucleosome depletion in a transcription-independent manner at mammalian promoters. Genome Res. 2012;22:2399–2408. doi: 10.1101/gr.138776.112.
- 24. Struhl K, Segal E. Determinants of nucleosome positioning. Nature Struct and Mol Biol. 2013;20:267–273. doi: 10.1038/nsmb.2506.
- 25. Drillon G, Audit B, Argoul F, Arneodo A. Evidence of selection for an accessible nucleosomal array in human. BMC Genomics. 2016;17:526. doi: 10.1186/s12864-016-2880-2.
- 26. Kalhor R, Tjong H, Jayathilaka N, Alber F, Chen L. Genome architectures revealed by tethered chromosome conformation capture and population-based modelling. Nat Biotechnol. 2011;30:90–98. doi: 10.1038/nbt.2057.
- 27. Tjong H, et al. Population-based 3D genome structure analysis reveals driving forces in spatial genome organization. Proc Natl Acad Sci USA. 2016;113:E1663–1672. doi: 10.1073/pnas.1512577113.
- 28. Clay O, Carels N, Douady C, Macaya G, Bernardi G. Compositional heterogeneity within and among isochores in mammalian genomes. I. CsCl and sequence analyses. Gene. 2001;276:15–24. doi: 10.1016/S0378-1119(01)00667-9.
- 29. Zufferey M, Tavernari D, Oricchio E, Ciriello G. Comparison of computational methods for the identification of topologically associating domains. Genome Biol. 2018;19:217. doi: 10.1186/s13059-018-1596-9.
- 30. Matthews BJ, Waxman DJ. Computational prediction of CTCF/cohesin-based intra-TAD loops that insulate chromatin contacts and gene expression in mouse liver. Elife. 2018;7:e34077. doi: 10.7554/eLife.34077.
- 31. Caudron-Herger M, et al. Alu element-containing RNAs maintain nucleolar structure and function. EMBO J. 2015;34:2758–2574. doi: 10.15252/embj.201591458.
- 32. Shin Y, et al. Liquid Nuclear Condensates Mechanically Sense and Restructure the Genome. Cell. 2018;175:1481–1491.e13. doi: 10.1016/j.cell.2018.10.057.
- 33. Saccone S, Federico C, Bernardi G. Localization of the gene-richest and the gene-poorest isochores in the interphase nuclei of mammals and birds. Gene. 2002;300:169–78. doi: 10.1016/S0378-1119(02)01038-7.
- 34. Cremer T, et al. The 4D nucleome: Evidence for a dynamic nuclear landscape based on co-aligned active and inactive nuclear compartments. FEBS Lett. 2015;589:2931–43. doi: 10.1016/j.febslet.2015.05.037.
- 35. Yaffe E, Tanay A. Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture. Nat Genet. 2011;43:1059–1065. doi: 10.1038/ng.947.
- 36. Olivares-Chauvet P, et al. Capturing pairwise and multi-way chromosomal conformations using chromosomal walks. Nature. 2016;540:296–300. doi: 10.1038/nature20158.
- 37. Vinogradov AE. DNA helix: the importance of being GC-rich. Nucleic Acids Res. 2003;31:1838–44. doi: 10.1093/nar/gkg296.
- 38. Jabbari K, Bernardi G. Cytosine methylation and CpG, TpG (CpA) and TpA frequencies. Gene. 2004;333:143–149. doi: 10.1016/j.gene.2004.02.043.
- 39. Li W, Miramontes P. Large-scale oscillation of structure-related DNA sequence features in human chromosome 21. Phys Rev E Stat Nonlin Soft Matter Phys. 2006;74:021912. doi: 10.1103/PhysRevE.74.021912.
- 40. Peckham HE, et al. Nucleosome positioning signals in genomic DNA. Genome Res. 2007;17:1170–1177. doi: 10.1101/gr.6101007.
- 41. Locke G, Tolkunov D, Moqtaderi Z, Struhl K, Morozov AV. High-throughput sequencing reveals a simple model of nucleosome energetics. Proc Natl Acad Sci USA. 2010;107:20998–1003. doi: 10.1073/pnas.1003838107.
- 42. Chuang HM, Reifenberger JG, Cao H, Dorfman KD. Sequence-Dependent Persistence Length of Long DNA. Phys Rev Lett. 2017;119:227802. doi: 10.1103/PhysRevLett.119.227802.
- 43. Riggs AD. DNA methylation and late replication probably aid cell memory, and type I DNA reeling could aid chromosome folding and enhancer function. Philos Trans R Soc Lond Biol Sci. 1990;326:285–297. doi: 10.1098/rstb.1990.0012.
- 44. Nasmyth K. Disseminating the genome: joining, resolving, and separating sister chromatids during mitosis and meiosis. Annu Rev Genet. 2001;35:673–745. doi: 10.1146/annurev.genet.35.102401.091334.
- 45. Alipour E, Marko JF. Self-organization of domain structures by DNA-loop-extruding enzymes. Nucleic Acids Res. 2012;40:11202–11212. doi: 10.1093/nar/gks925.
- 46. Goloborodko A, Marko JF, Mirny LA. Chromosome Compaction by Active Loop Extrusion. Biophys J. 2016;110:2162–2168. doi: 10.1016/j.bpj.2016.02.041.
- 47. Barrington C, Finn R, Hadjur S. Cohesin biology meets the loop extrusion model. Chromosome Res. 2017;25:51–60. doi: 10.1007/s10577-017-9550-3.
- 48. Terakawa T, et al. The condensin complex is a mechanochemical motor that translocates along DNA. Science. 2017;358:672–676. doi: 10.1126/science.aan6516.
- 49. Nasmyth K. How are DNAs woven into chromosomes? Science. 2017;358:589–590. doi: 10.1126/science.aap8729.
- 50. Vian L, et al. The Energetics and Physiological Impact of Cohesin Extrusion. Cell. 2018;175:292–294. doi: 10.1016/j.cell.2018.09.002.
- 51. Nishiyama T. Cohesion and cohesin-dependent chromatin organization. Curr Opin Cell Biol. 2018;58:8–14. doi: 10.1016/j.ceb.2018.11.006.
- 52. Ea V, et al. Distinct polymer physics principles govern chromatin dynamics in mouse and Drosophila topological domains. BMC Genomics. 2015;16:607. doi: 10.1186/s12864-015-1786-8.
- 53. Torres CM, et al. The linker histone H1.0 generates epigenetic and functional intratumor heterogeneity. Science. 2016;353:6307. doi: 10.1126/science.aaf1644.
- 54. Kleckner N. Chiasma formation: chromatin/axis interplay and the role(s) of the synaptonemal complex. Chromosoma. 2006;115:175–194. doi: 10.1007/s00412-006-0055-7.
- 55. Jabbari K, Wirtz J, Rauscher M, Wiehe T. A common genomic code for chromatin architecture and recombination landscape. PLoS One. 2019;14:e0213278. doi: 10.1371/journal.pone.0213278.
- 56. Dixon JR, Gorkin DU, Ren B. Chromatin Domains: The Unit of Chromosome Organization. Mol Cell. 2016;62:668–80. doi: 10.1016/j.molcel.2016.05.018.
- 57. Bernardi G. The formation of chromatin domains involves a primary step based on the 3-D structure of DNA. Sci Rep. 2018;8:17821. doi: 10.1038/s41598-018-35851-0.
- 58. Zhang X, Branciamore S, Gogoshin G, Rodin AS, Riggs AD. Analysis of high-resolution 3D intrachromosomal interactions aided by Bayesian network modeling. Proc Natl Acad Sci USA. 2017;114:E10359–E10368. doi: 10.1073/pnas.1620425114.
- 59. Zuin J, et al. Cohesin and CTCF differentially affect chromatin architecture and gene expression in human cells. Proc Natl Acad Sci USA. 2014;111:996–1001. doi: 10.1073/pnas.1317788111.
- 60. Schwarzer W, et al. Two independent modes of chromatin organization revealed by cohesin removal. Nature. 2017;551:51–56. doi: 10.1038/nature24281.
- 61. Rao SSP, et al. Cohesin Loss Eliminates All Loop Domains. Cell. 2017;171:305–320.e24. doi: 10.1016/j.cell.2017.09.026.
- 62. Nora EP, et al. Targeted Degradation of CTCF Decouples Local Insulation of Chromosome Domains from Genomic Compartmentalization. Cell. 2017;169:930–944.e22. doi: 10.1016/j.cell.2017.05.004.
- 63. Uusküla-Reimand L, et al. Topoisomerase II beta interacts with cohesin and CTCF at topological domain borders. Genome Biol. 2016;17:182. doi: 10.1186/s13059-016-1043-8.
- 64. Racko D, Benedetti F, Dorier J, Stasiak A. Are TADs supercoiled? Nucleic Acids Res. 2019;47:521–532. doi: 10.1093/nar/gky1091.
- 65. Gothe HJ, et al. Spatial Chromosome Folding and Active Transcription Drive DNA Fragility and Formation of Oncogenic MLL Translocations. Mol Cell. 2019;75:267–283. doi: 10.1016/j.molcel.2019.05.015.
- 66. Hyman AA, Weber CA, Jülicher F. Liquid-liquid phase separation in biology. Annu Rev Cell Dev Biol. 2014;30:39–58. doi: 10.1146/annurev-cellbio-100913-013325.
- 67. Cho WK, et al. Mediator and RNA polymerase II clusters associate in transcription-dependent condensates. Science. 2018;361:412–415. doi: 10.1126/science.aar4199.
- 68. Sabari BR, et al. Coactivator condensation at super-enhancers links phase separation and gene control. Science. 2018;361:6400. doi: 10.1126/science.aar3958.
- 69. Németh A, et al. Initial genomics of the human nucleolus. PLoS Genet. 2010;6:e1000889. doi: 10.1371/journal.pgen.1000889.
- 70. Li P, et al. Phase transitions in the assembly of multivalent signalling proteins. Nature. 2012;483:336–40. doi: 10.1038/nature10879.
- 71. McIntosh DB, Duggan G, Gouil Q, Saleh OA. Sequence-dependent elasticity and electrostatics of single-stranded DNA: signatures of base-stacking. Biophys J. 2014;106:659–666. doi: 10.1016/j.bpj.2013.12.018.
- 72. Shakya A, King JT. DNA Local-Flexibility-Dependent Assembly of Phase-Separated Liquid Droplets. Biophys J. 2018;115:1840–1847. doi: 10.1016/j.bpj.2018.09.022.
- 73. Peters JP, 3rd, Maher LJ. DNA curvature and flexibility in vitro and in vivo. Q Rev Biophys. 2010;43:23–63. doi: 10.1017/S0033583510000077.
- 74. Ricci MA, Manzo C, García-Parajo MF, Lakadamyali M, Cosma MP. Chromatin fibers are formed by heterogeneous groups of nucleosomes in vivo. Cell. 2015;160:1145–1158. doi: 10.1016/j.cell.2015.01.054.
- 75. Boija A, et al. Transcription Factors Activate Genes through the Phase-Separation Capacity of Their Activation Domains. Cell. 2018;175:1842–1855. doi: 10.1016/j.cell.2018.10.042.
- 76. Hnisz D, Shrinivas K, Young RA, Chakraborty AK, Sharp PA. A Phase Separation Model for Transcriptional Control. Cell. 2017;23:13–23. doi: 10.1016/j.cell.2017.02.007.
- 77. Zirkel A, Papantonis A. Transcription as a force partitioning the eukaryotic genome. Biol Chem. 2014;395:1301–5. doi: 10.1515/hsz-2014-0196.
- 78. Trifonov EN. The multiple codes of nucleotide sequences. Bull Math Biol. 1989;5:417–32. doi: 10.1007/BF02460081.
- 79. Graur D. An Upper Limit on the Functional Fraction of the Human Genome. Genome Biol Evol. 2017;9:1880–1885. doi: 10.1093/gbe/evx121.
- 80. Doolittle WF, Brunet TDP. On causal roles and selected effects: our genome is mostly junk. BMC Biol. 2017;15:116. doi: 10.1186/s12915-017-0460-9.
- 81. Zuckerkandl E. Polite DNA: functional density and functional compatibility in genomes. J Mol Evol. 1986;24:12–27. doi: 10.1007/BF02099947.
- 82. Pope BD, et al. Topologically associating domains are stable units of replication-timing regulation. Nature. 2014;515:402–405. doi: 10.1038/nature13986.
- 83. Costantini M, Clay O, Auletta F, Bernardi G. An isochore map of human chromosomes. Genome Res. 2006;16:536–541. doi: 10.1101/gr.4910606.
- 84. Le S, Josse J, Husson F. FactoMineR: An R Package for Multivariate Analysis. Journal of Statistical Software. 2008;25:1–18. doi: 10.18637/jss.v025.i01.
- 85. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
- 86. Kassambara, A. & Mundt, F. Factoextra: Extract and Visualize the Results of Multivariate Data Analyses. R package version 1.0.5 (2017).
- 87. Harrow J, et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 2012;22:1760–1774. doi: 10.1101/gr.135350.111.
- 88. Khan A, Zhang X. dbSUPER: a database of super-enhancers in mouse and human genome. Nucleic Acids Research. 2016;44:164–171. doi: 10.1093/nar/gkv1002.
- 89. Ziebarth J, Bhattacharya A, Cui Y. CTCFBSDB 2.0: a database for CTCF-binding sites and genome organization. Nucleic Acids Research. 2013;41:188–194. doi: 10.1093/nar/gks1165.
- 90. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
- 91. Kryuchkova-Mostacci N, Robinson-Rechavi M. A benchmark of gene expression tissue-specificity metrics. Brief. Bioinform. 2017;18:205–214. doi: 10.1093/bib/bbw008.