Abstract
CXXC5 is a member of the zinc-finger CXXC family that binds to unmethylated CpG dinucleotides. CXXC5 modulates gene expressions resulting in diverse cellular events mediated by distinct signaling pathways. However, the mechanism responsible for CXXC5 expression remains largely unknown. We found here that of the 14 annotated CXXC5 transcripts with distinct 5′ untranslated regions encoding the same protein, transcript variant 2 with the highest expression level among variants represents the main transcript in cell models. The DNA segment in and at the immediate 5′-sequences of the first exon of variant 2 contains a core promoter within which multiple transcription start sites are present. Residing in a region with high G–C nucleotide content and CpG repeats, the core promoter is unmethylated, deficient in nucleosomes, and associated with active RNA polymerase-II. These findings suggest that a CpG island promoter drives CXXC5 expression. Promoter pull-down revealed the association of various transcription factors (TFs) and transcription co-regulatory proteins, as well as proteins involved in histone/chromatin, DNA, and RNA processing with the core promoter. Of the TFs, we verified that ELF1 and MAZ contribute to CXXC5 expression. Moreover, the first exon of variant 2 may contain a G-quadruplex forming region that could modulate CXXC5 expression.
Subject terms: Cell biology, Molecular biology
Introduction
DNA methylation is one of the mechanisms of gene silencing and primarily occurs in CpG dinucleotides of the genome. Methylation of cytosine residues results in the recruitment of methyl-CpG-binding proteins (MBPs) that act as transcription repressors1. Although the majority of CpGs in mammalian genomic DNA are methylated2, about 70% of human gene promoters are associated with unmethylated DNA sequences called CpG islands (CGIs)3,4. CGIs, which are rich in C and G nucleotides and defined by a high density of CpG dinucleotides, are often refractory to methylation and characterized by a chromatin state permissive for transcription. Recent studies indicate that deregulation of the tissue-specific methylation state of different classes of CGI promoters could contribute to the initiation/progression of cancer5,6. The establishment and maintenance of CGI-specific chromatin conditions are mediated by structurally and functionally distinct zinc-finger (ZF)-CXXC family proteins. The ZF-CXXC proteins preferentially interact with unmethylated CpG dinucleotides through a highly conserved ZF-CXXC domain characterized by two consecutive cysteine-rich motifs (CXXCXXC) tetrahedrally coordinated with Zn2+ ions forming zinc-finger structures. Upon binding to DNA, the ZF-CXXC proteins establish a chromatin architecture directly through chromatin-modifying enzymatic activities and/or indirectly through the recruitment of chromatin-modifiers7–9.
CXXC5, also known as RINF and WID, is a member of the ZF-CXXC family8,10. The CXXC5 gene located on chromosome 5q31.2 is ubiquitously expressed, albeit at varying levels, in human tissues8,10. Evidence indicates that morphogenic retinoic acid11, multifunctional cytokine family member transforming growth factor-β12, bone morphogenetic protein 413,14, the Wnt family of secreted glycolipoprotein Wnt3a15–17 or estrogen18–20 alters the CXXC5 expression as the primary response gene, whose protein product subsequently leads to changes in cell type-specific secondary gene expressions13,17,21–26. These changes are manifested as the modulation of cellular metabolism, proliferation, differentiation, or death in developmental processes and tissue maintenance11–13,15,21,23,24,26–31. Consistent with the functional importance of CXXC5 in physiology, de-regulated expressions of CXXC5 have been reported to correlate with the development of various pathologies including acute myeloid leukemia (AML), gastric, prostate, and breast cancer11,32–38.
Despite the involvement of CXXC5 in diverse cellular events mediated by distinct signaling pathways, the mechanism responsible for the expression of the CXXC5 gene remains largely unknown. Spatio-temporal control of gene expression is achieved at the transcriptional level. This requires the integrated effects of sequence-specific trans-factors, general transcription regulators, and cis-acting DNA regulatory elements including promoters, promoter-proximal elements, distance-independent elements, locus control regions, and insulators within a highly dynamic chromatin environment39,40. Nevertheless, promoters as diverse and complex architectural DNA segments primarily located adjacent to the transcriptional start sites (TSSs) of genes constitute the key platform for the assembly of pre-initiation complexes to mediate transcription39,40. Delineation of promoter features of the CXXC5 gene is essential for understanding the mechanisms of CXXC5 gene regulation in a signal pathway- and cell type-dependent manner that could underlie its role in physiology and pathophysiology.
We found here that of the 14 annotated CXXC5 transcripts, transcript variant 2, which is composed of Exons 3, 10, and 11 of CXXC5 and has the highest expression level among transcript variants, represents the main CXXC5 transcript in cell models. We also identified a DNA segment in and at the 5′ sequences of the first exon of transcript variant 2 (Exon3) as the core promoter region. Based on the DNA sequence composition and motifs, the chromatin configuration, and the presence of multiple TSSs together with an active RNA polymerase II at the core promoter, we suggest that a CGI promoter drives the expression of CXXC5. A promoter pull-down approach revealed the potential association of various transcription factors (TFs)/co-regulatory proteins, as well as of proteins involved in histone/chromatin, DNA, and RNA processing, with the core promoter. Of the TFs, we found that ELF1 and MAZ contribute to CXXC5 expression. Moreover, the DNA sequence within the first exon of transcript variant 2 was found to form a G-quadruplex (G4) structure in vitro that could modulate CXXC5 expression in cellula.
Results
Genes can have multiple transcript variants encoding the same protein or different variants of a protein, and/or non-coding transcript variants as a result of alternative promoter usage and/or alternative splicing41–43. Alternative splicing of a single gene can generate a repertoire of protein isoforms with distinct features. Alternative promoters of a gene could have different tissue specificity, developmental activity, and expression levels, and they may produce protein isoforms with distinct amino-termini41–43. The CXXC5 gene located on the long arm of chromosome 5, in the 5q31.2 region, is approximately 35 kb in length according to the human genome assembly GRCh38. The gene contains 11 exons and 10 introns. Transcript annotations by databases including Ensembl (https://www.ensembl.org/index.html) and NCBI (https://www.ncbi.nlm.nih.gov) indicate that the CXXC5 gene can generate 14 transcript variants in different tissues (Fig. 1a,b), identified by the Expressed Sequence Tags (ESTs) approach widely utilized to identify alternative splicing products of genes44,45. The last two exons of the gene are found in all 14 transcript variants and contain the coding region of 969 nucleotides (nt), 924 nt of which are in Exon10 and 45 nt, including a stop codon, are in Exon11. All 14 transcript variants of CXXC5 encode the same protein with a calculated molecular mass (MM) of 33 kDa.
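The coding-region arithmetic above can be checked with a short sketch. The function name and its interface are illustrative assumptions; only the exon lengths (924 and 45) and the fact that the total includes a stop codon come from the text.

```python
def coding_summary(coding_lengths, stop_codon_included=True):
    """Return (total nt, codons, amino-acid residues) for a coding region.

    coding_lengths: coding nucleotides contributed by each exon.
    The helper is hypothetical and only illustrates the arithmetic.
    """
    total_nt = sum(coding_lengths)
    if total_nt % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    codons = total_nt // 3
    # The stop codon occupies one codon but encodes no residue.
    residues = codons - 1 if stop_codon_included else codons
    return total_nt, codons, residues


# Exon10 contributes 924 nt and Exon11 contributes 45 nt (stop codon included).
total_nt, codons, residues = coding_summary([924, 45])
print(total_nt, codons, residues)
```

Under these inputs the two exons sum to the stated 969 nt, that is, 323 codons, one of which is the stop codon.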
Since annotation of promoters in the human genome relies on the experimental evidence of 5′-ends of mRNA transcripts, which primarily correspond to the transcription start sites46,47, we predicted that the identification of TSS(s) of the main transcript variant(s) as the quantitatively most abundant one(s) could be used to define the promoter region(s) of CXXC5. The expression of CXXC5 as a retinoid-inducible nuclear factor critical for retinoid-induced cellular differentiation was first reported in cell models, including HL60 cells, derived from leukemia11,48. Similarly, we showed using breast adenocarcinoma-derived cell lines, including MCF7 cells, that CXXC5 is an E2-ER responsive gene18,19 and CXXC5 as an unmethylated CpG binder contributes to E2-mediated gene expressions critical for cellular proliferation29. We, therefore, explored the presence and the extent of transcript variant expressions of CXXC5 in MCF7 cells and HL60 cells as cell models. To identify transcript variants of CXXC5 expressed in MCF7 and HL60 cells, cDNA libraries generated from total RNA were subjected to multiple nested rounds of PCR using progressively nested primers specific to each variant together with a primer specific to Exon10, which is common to all CXXC5 transcript variants, followed by cloning and sequencing of the PCR amplicons (Supplementary Information, Table 1). Results revealed that CXXC5 generates transcript variants 1–3, 5–8, and 10 in MCF7 cells; whereas transcript variants 1–8 and 10 were detectable in HL60 cells (Fig. 1c). These findings suggest that the CXXC5 gene, except for transcript variant 4 which is present in only HL60 cells, generates the same transcript variants in both cell lines.
Our GTEx Portal analyses (www.gtexportal.org) indicate that although transcript variants of CXXC5 are expressed at varying levels in distinct tissues, transcript variant 2 and transcript variant 3 in breast tissue and transcript variant 2 in blood show the highest expression levels (Fig. 1d). Consistent with these, our qPCR results with the cDNA libraries used in the detection of transcript variants revealed that CXXC5-transcript variant 2 (NCBI Accession: NM_016463, Version: NM_016463.9), which contains Exons 3, 10, and 11 is the transcript with the highest relative expression in both MCF7 and HL60 cells (Fig. 1e).
Based on these results, we carried out Northern Blot (NB) analyses using ribosomal-RNA depleted RNA samples from MCF7 or HL60 cells. The length of the annotated human CXXC5 transcript variants varies between 2163 and 2695 nt (retrieved in March 2021, https://www.ncbi.nlm.nih.gov/datasets/tables/genes/?table_type=transcripts&key=37c49fee33e03d725b813c4cda693206). We used a biotinylated probe complementary to the joint boundaries of Exon10 and Exon11 (343 nt in length), which are present in all transcript variants, and a biotinylated probe targeting Exon3 sequences (390 nt). We also used a GAPDH probe (589 nt) targeting Exon5-8 to detect the GAPDH transcript of 1525 nt in length as the control (NCBI Accession: NM_001289746, Version: NM_001289746.2). Results revealed that both probes specific to CXXC5 detect primarily a CXXC5 transcript with an electrophoretic migration of approximately 2500 nt in length, similar to that of the annotated transcript variant 2 (2601 nt), while, as expected, a single GAPDH transcript of about 1500 nt was detected (Fig. 1f).
Because of these findings, we predicted that the promoter of CXXC5 resides in a transcript variant 2 region that encompasses a transcription start site (TSS). For the identification of TSS(s), we used the 5′ Rapid Amplification of cDNA Ends (5′RACE) approach, designed for the amplification of nucleic acid sequences from a messenger RNA (mRNA) template between a defined internal site and unknown sequences at the 5′-end of the mRNA through the use of an adaptor RNA probe49. Although prone to biases introduced by various factors including RNA secondary structures, G–C nucleotide content, and adaptor ligation efficiency50, 5′RACE has been successfully used for the identification of 5′-ends of numerous RNA transcripts51. We also used TFF1, a well-studied estrogen-responsive gene19,52, as a control for 5′RACE studies. 3′RACE was also used for the identification of 3′-transcript sequences of both CXXC5 and TFF1 transcripts.
The 3′RACE approach readily identified the 3′ end of the CXXC5 or TFF1 transcript (Supplementary Information, Fig. S1). 5′RACE of CXXC5, in contrast to that of TFF1, which generates a transcript with a single TSS53 (Supplementary Information, Fig. S1), proved difficult, likely due to the high GC content (> 70%) of Exon3 and surrounding sequences. Nevertheless, our results based on the sequencing of PCR amplicons generated from cDNA libraries of MCF7 cells indicated that several 5′-ends of transcript variant 2 can be detected, suggestive of multiple TSSs (Fig. 2a). These results imply the presence of a transcription start region for transcript variant 2 rather than a distinct TSS.
Collectively, our results indicate that transcript variant 2 of CXXC5 composed of Exons 3, 10, and 11 is the main transcript in MCF7 and HL60 cells.
The CXXC5 promoter is located in a DNA segment encompassing the beginning of Exon3
Based on the similar results obtained in MCF7 and HL60 cells, we assessed the promoter activity of the putative promoter region of CXXC5 by generating a PCR amplicon of 1975 bp in length from MCF7 genomic DNA that includes 5′ upstream regions of Exon3, the entire Exon3, and Exon4 (Fig. 2b). The PCR amplicon was inserted into a reporter vector, pGL3-Basic, which has no promoter but bears the Firefly Luciferase cDNA as the reporter enzyme. We also used a reporter vector bearing the estrogen-responsive TFF1 gene promoter19 as control. We found in transiently transfected MCF7 cells that the reporter enzyme activity from the putative CXXC5 promoter region, as from the TFF1 promoter, was significantly higher compared to that observed with the reporter vector bearing no promoter, from which the enzyme activity is set to one (Fig. 2c). To decipher the core promoter elements of the putative CXXC5 promoter, we generated sequential truncations at the 5′- or 3′-end of the region by PCR and inserted them into the reporter vector. Results from transiently transfected MCF7 cells indicated that DNA sequences of Exon3 produce the highest reporter activity (Fig. 2c). Further truncations and/or internal deletions as Segments (A–D) of Exon3 revealed that Segment A, corresponding to, and including the 5′ surrounding sequences of, Exon3 (Supplementary Information, Fig. S2) retains the promoter activity (Fig. 2d,e). Interestingly, Segment C alone suppresses (Fig. 2e), and in the presence of other segments lessens (Fig. 2d) the activity of the reporter enzyme. This suggests that Segment C alone includes DNA elements adversely affecting transcription. In keeping with this prediction, the genetic fusion of Segment C, to the 3′-end sequences of the TFF1 promoter or of the strong human cytomegalovirus (CMV) promoter effectively repressed the Luciferase enzyme activity (Fig. 2f) in contrast to Segment D which has minimal effects on the reporter activity induced by the CMV promoter.
These results suggest that the core promoter elements of CXXC5 reside in Segment A.
The methylation state of the putative CXXC5 promoter region
Based on the conclusion that transcript variant 2 is the main CXXC5 transcript in both MCF7 and HL60 cells, we initially carried out in silico analyses of a genomic region, about 1500 bp in length, of the CXXC5 locus, wherein Exon3 is situated, as the putative promoter region (Fig. 3a). The nucleotide sequence of the region revealed (1) a remarkably high (> 70%) G–C content (https://www.biologicscorp.com/tools/GCContent/), (2) a greatly enriched CpG dinucleotide repeats (https://www.biologicscorp.com/tools/GCContent/), (3) an asymmetric GC distribution, GC skew, which is used as a measure of DNA strand asymmetry in the GC nucleotide distribution (http://genskew.csb.univie.ac.at/GenSkewServlet) as a property of CpG islands54, and (4) the presence of a CpG island (CGI) (EMBOSS Cpgplot; https://www.ebi.ac.uk/). These analyses suggest that the transcription start region, including Segment A, of the transcript variant 2 is located in a CpG island, a conclusion consistent with the CGI annotation track of the CXXC5 locus in the human genome (https://genome.ucsc.edu/).
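The sequence properties listed above (G–C content, CpG enrichment, and GC skew) are simple to compute. The following is a minimal sketch, not the web tools cited in the text. The function names are illustrative, and the thresholds in `meets_cgi_criteria` (length of at least 200 bp, G–C content of at least 50%, observed/expected CpG ratio of at least 0.6) are the commonly used Gardiner-Garden and Frommer criteria, assumed here because the text does not state which cutoffs were applied.

```python
def gc_content(seq):
    """Fraction of G and C nucleotides in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def gc_skew(seq):
    """Strand asymmetry (G - C) / (G + C); positive when G exceeds C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def meets_cgi_criteria(seq, min_len=200, min_gc=0.5, min_oe=0.6):
    """Assumed CpG-island test; the cutoffs are not taken from the text."""
    return (len(seq) >= min_len
            and gc_content(seq) >= min_gc
            and cpg_obs_exp(seq) >= min_oe)


# Toy sequence, not the CXXC5 locus.
toy = "CGCGCGATCG"
print(gc_content(toy), cpg_obs_exp(toy), gc_skew(toy))
```

In practice such functions would be applied in a sliding window across the approximately 1500 bp region rather than to the whole sequence at once.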
The methylation of mammalian genomic DNA shows variations across cell types, developmental stages, and physiological and pathophysiological conditions55. Acting as stable and heritable epigenetic marks, methylated CpGs constitute 80% of the CpGs in the genome and occur in both genic and intergenic regions2. Although the majority of CpGs are methylated, about 70% of human gene promoters are associated with unmethylated CGIs3,4. CGIs are short (200–2000 bp) DNA segments that display high G-C content with enriched CpG dinucleotide repeats3,4,56. CGI promoters, which often define promoters of housekeeping, developmental and tissue‐specific genes, show a transcriptionally permissive state, within which transcription initiation can occur at several closely spaced locations3,4,56. To examine the methylation state of Exon3 and the surrounding region including the putative CXXC5 promoter segment, we explored a targeted methylation profile of the region, as well as of Exon10 as a control for the methylated gene body of the CXXC5 locus, using bisulfite-sequencing. Genomic DNA of MCF7 cells was subjected to bisulfite reaction to convert unmethylated cytosine residues to uracil followed by bisulfite PCR. PCR amplicons generated with bisulfite primers were cloned and sequenced. Sequences were then aligned to the genomic sequence of the corresponding CXXC5 regions using QUMA57 (http://quma.cdb.riken.jp/). Results from MCF7 (Fig. 3b) as well as HL60 cells (Supplementary Information Fig. S3) indicated that the 5′-upstream region of Exon3 shows a high degree of CpG methylation, which declines precipitously thereafter, and that the region including Exon3 (Fig. 3b) and Exon4 (data not shown) remains largely unmethylated. This contrasts with Exon10, which is highly methylated (Fig. 3b).
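The logic of calling methylation from bisulfite-converted clones can be sketched as follows. This is a simplified illustration and not the QUMA implementation: it assumes that each clone read is already aligned to the reference without gaps and covers the same strand, and all function names are hypothetical. After conversion and PCR, a cytosine retained at a CpG site is read as methylated, whereas a thymine at that site is read as unmethylated.

```python
def cpg_positions(reference):
    """0-based positions of the C in every CpG of the reference."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def methylation_per_site(reference, clones):
    """Fraction of clones methylated at each CpG site.

    clones: bisulfite-converted reads, assumed gap-free and aligned
    position by position to the reference (a simplifying assumption).
    Returns {position: fraction}, or None where no clone is informative.
    """
    result = {}
    for pos in cpg_positions(reference):
        calls = [clone[pos].upper() for clone in clones if len(clone) > pos]
        methylated = calls.count("C")      # protected from conversion
        unmethylated = calls.count("T")    # converted C -> U -> T
        informative = methylated + unmethylated
        result[pos] = methylated / informative if informative else None
    return result


# Toy reference with CpGs at positions 1 and 4, and two toy clones.
reference = "ACGTCGA"
clones = ["ACGTTGA", "ATGTTGA"]
print(methylation_per_site(reference, clones))
```

A fuller analysis would also use non-CpG cytosines to estimate the bisulfite conversion efficiency and would discard clones that fall below a chosen cutoff.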
In common with all eukaryotic promoters, unmethylated CGI promoters possess a nucleosome-free region surrounding TSSs58 and contain dispersed nucleosomes decorated with H3K4me3, which marks active transcription59–61. To assess the nucleosome occupancy at the DNA region including the putative CXXC5 promoter elements, MCF7 cells were fixed, permeabilized, and treated with Micrococcal Nuclease (MNase) for chromatin digestion (Fig. 3c). DNA was subsequently purified and analyzed for digestion patterns with agarose gel electrophoresis. DNA fragments corresponding to tri-nucleosomal and mono-nucleosomal DNA were excised from the gel and purified. The fragmented DNA, or the uncut genomic DNA of MCF7 cells as control, was used as the template for PCR to assess the presence of nucleosomes at Exon3. For initial analyses, five overlapping regions (depicted as T1-5, Fig. 3d) were subjected to PCR using the tri-nucleosomal DNA template with the region-specific primer pairs. For further verification, three sub-regions of Exon3 (depicted as M1-3, Fig. 3d) were also analyzed using the mono-nucleosomal DNA. The detection of a PCR amplicon from fragmented DNA compared to genomic DNA suggests the presence of nucleosomes. Results with tri- (Fig. 3e) or mono-nucleosomal (Fig. 3f) DNA template revealed that the 5′-surrounding sequences of Exon3 and Segment A are primarily nucleosome-deficient and the remaining segments of Exon3 contain nucleosomes. To verify this finding, we carried out ChIP of Exon3 (Fig. 3g,h). MCF7 cells processed for chromatin digestion by the use of MNase, as described for nucleosome occupancy, were subjected to ChIP using an antibody specific to H3 (Fig. 3g) or tri-methylated histone H3 lysine 4, H3K4me3 (Fig. 3h), a histone modification used as a marker for actively transcribed genes60. Purified DNA was then subjected to qPCR using primers specific to Segments of Exon3.
We found that Segment A is indeed devoid of H3 but the remaining segments of Exon3 bear H3 decorated with K4me3 modification. We also observed the presence of an active PolII at Exon3, shown on Segment A, as on the promoter of the housekeeping GAPDH gene as control, using ChIP-qPCR with an antibody specific to PolII or Ser5 phosphorylated PolII (Fig. 3i).
Our results collectively indicate that Segment A of Exon3 constitutes the core promoter element of CXXC5 located in a CGI.
Identification of proteins engaged with the CXXC5 core promoter
To evaluate proteins that potentially engage with the core promoter of CXXC5, we used a promoter pull-down approach. Nuclear extracts of MCF7 cells were incubated overnight with a 5′-end biotinylated PCR amplicon containing Segment A (220 bp in length) or a fragment of Exon10 (220 bp) as control DNA followed by incubation with streptavidin-conjugated magnetic beads. Proteins bound to beads/DNA were then subjected to mass spectrometry (MS). Subtractive analysis of MS results obtained with proteins bound to beads, the control DNA, and Segment A revealed 94 proteins that specifically associate with the core CXXC5 promoter (Fig. 4; Supplementary Information Fig. S4–S8; Supplementary Information Table 2). Analyses using STRING v1162 (https://string-db.org/) and DAVID63 (https://david.ncifcrf.gov/) databases suggest that Segment A associated-proteins are mainly grouped in the regulation of gene expression, which can further be sub-grouped into TFs and transcription co-regulatory proteins as well as proteins involved in histone/chromatin, DNA, and RNA processing (Fig. 4). Proteins identified as TFs include AFF1, ATF7, CCGBP1, CREB1, ELF1, MAZ, MGA, MYNN, NFIA, NFIB, PRDM10, TFAP2C, TFAP4, ZBTB2, ZBTB7A, ZBTB7B, ZNF596, and ZNF625. Transcription co-regulatory proteins comprise ANKRD12, ATXN7, BCOR, BRD2, BRD3, CBX8, MTA1, RBBP6, RB1, TADA2B, TRRAP, and WIZ. The group of proteins involved in the processing of RNA transcripts includes BUD31, CNOT1, DDX41, DDX49, DDX50, DDX54, RANBP2, and YBX1. The protein group associated with chromatin/histone binding, modifications, and organization as well as DNA conformational changes encompasses GATAD2A, INO80, JMJD1C, KAT2A, KDM1A, KDM2A, MCRS1, ORC5, RPA2, TAF6, TAF6L, and TOP3A.
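The subtractive analysis described above amounts to a set difference: proteins recovered with Segment A are retained only if they are absent from both the control DNA and the beads-only samples. The sketch below illustrates this under that assumption. The function name and the protein identifiers (P1 to P4) are hypothetical placeholders and not data from this study, and the actual MS workflow may have applied additional scoring or abundance filters.

```python
def specific_binders(segment_a, control_dna, beads_only):
    """Proteins found with the target DNA but in neither control sample.

    Each argument is an iterable of protein identifiers from one
    pull-down. Returns a sorted list for reproducible output.
    """
    specific = set(segment_a) - set(control_dna) - set(beads_only)
    return sorted(specific)


# Hypothetical identifiers used only to illustrate the subtraction.
segment_a_hits = {"P1", "P2", "P3", "P4"}
control_dna_hits = {"P2"}      # binds DNA regardless of sequence
beads_only_hits = {"P3"}       # binds the beads themselves
print(specific_binders(segment_a_hits, control_dna_hits, beads_only_hits))
```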
To assess the binding of TFs obtained with the promoter pull-down approach as the putative binders to sequences of Segment A, we initially carried out bioinformatics analyses using the Cistrome (http://cistrome.org/) database, a resource of human and mouse cis-regulatory information derived from ChIP-seq, DNase-seq, and ATAC-seq chromatin profiling assays to map the genome-wide locations of transcription factor binding sites64. Due to the availability of information on TFs in Cistrome, the possible association of 16 TFs (AFF1, ATF7, CREB1, ELF1, MAZ, MGA, MYNN, NFIA, NFIB, PRDM10, RB1, TFAP2C, TFAP4, ZBTB2, ZBTB7A, and ZBTB7B) with Exon3 and surrounding sequences was analyzed with datasets generated by the use of MCF7 cells and/or of other cell lines for which datasets were available. Results revealed that while ATF7, CREB1, MGA, MYNN, NFIA, NFIB, ZBTB2, or ZBTB7B does not appear to interact with the Exon3 region, the association of ELF1, TFAP4, or TFAP2C with the region in cells seems to be dependent on tissue-of-origin. On the other hand, AFF1, MAZ, PRDM10, RB1, or ZBTB7A could be involved in the regulation of CXXC5 expression in MCF7, and also in other, cells by interacting with the Exon3 region (Supplementary Information, Fig. S9).
Interactions of TFs with Segment A in cellula
MAZ binds to DNA sequences with high G nucleotide content65,66, which are abundantly present in the CXXC5 core promoter. ELF1, upon binding to DNA, could regulate gene expressions through interaction with RB167. RB1, which we identified here as one of the Segment A interacting proteins as well, indirectly associates with DNA through interactions with, for example, members of the E2F family proteins and hematopoietic transcription factors68. Based on these observations, we reasoned that ELF1 and MAZ could be involved in the regulation of the CXXC5 gene expression. We also carried out ChIP for RB1. To assess the possible presence of TFs on Segment A, we initially examined the efficiency of antibodies to precipitate the protein of interest with immunoblotting (IB) following ChIP (ChIP-IB) (Fig. 5a) and subsequently assessed the amount of isolated DNA with qPCR (ChIP-qPCR) (Fig. 5b) using primer sets specific for Segment A. As control, we also used primers specific for the promoter of OAS1 (2′-5′-Oligoadenylate Synthetase 1), with which ELF1 is shown to interact67. Similarly, primer sets for the promoter of MYC (MYC Proto-Oncogene, BHLH Transcription Factor) were used as control to assess the previously shown interaction of MAZ69 or RB170. We also used Exon2 of MB (Myoglobin) as control. In addition, ChIP using an antibody specific to CREB1 was conducted to ensure that CREB1 does not interact with the Exon3 region, as the findings of the Cistrome database suggested.
Results revealed that ELF1 or RB1, as Ser5 phosphorylated PolII, indeed associates with Segment A, as each interacts with the promoter elements of the respective control gene but not with MB (Fig. 5b). MAZ synthesized endogenously in MCF7 cells displays electrophoretic mobility of about 57 kDa (Fig. 6g) that co-migrates with the heavy chain of IgG in immunoprecipitates (Supplementary Information, Fig. S10a). This renders the presence of MAZ in precipitates difficult to decipher. To ensure that the antibody, which recognizes sequences at the carboxyl-terminus of MAZ, precipitates the protein, we used an amino terminally truncated MAZ (MAZΔN) with an estimated MM of 37 kDa (Supplementary Information, Fig. S10b). MCF7 cells were transiently transfected with an expression vector bearing the HA-MAZ or HA-MAZΔN cDNA. Cells were then subjected to ChIP-IB using the MAZ antibody. The presence of HA-MAZΔN in the precipitates indicated that the antibody immunoprecipitates the MAZ protein (Supplementary Information, Fig. S10b). Based on this finding, we carried out ChIP of MCF7 cells using the MAZ antibody. qPCR results revealed that MAZ interacts with Segment A and the MYC promoter but not with Exon2 of MB (Fig. 5b).
CREB1 did not show an association with Segment A or MB but it interacted with the promoter elements of CCNA2 (Supplementary Information, Fig. S10c,d), as shown previously71.
Sequence motif analyses for ELF1 and MAZ on segment A
To examine binding sites for ELF1 or MAZ on Segment A, we performed sequence motif analyses using our motif analysis tool72 and the JASPAR (http://jaspar.genereg.net/) database73, which is a resource for curated, non-redundant TF-binding profiles stored as position frequency matrices (PFMs) for TFs. We identified potential binding sites for MAZ and ELF1 proteins in Segment A (Supplementary Information, Fig. S11a,b). Moreover, one of the characteristics of CGI promoters is the lack of sequence motifs for TATA-box or downstream promoter element (DPE) positioned at distinct locations relative to TSS that define non-CpG promoters3,4,74. Consistent with this, we found no such elements throughout the CXXC5 locus including Segment A.
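How a position frequency matrix is used to locate candidate binding sites can be sketched as follows. This is a generic log-odds scan and does not reproduce the motif analysis tool used here or the JASPAR web interface. The toy matrix, the pseudocount, and the uniform background frequency are assumptions chosen for illustration, and only the forward strand is scanned.

```python
import math


def pfm_to_pwm(pfm, pseudocount=0.5, background=0.25):
    """Convert a PFM {base: [counts per position]} into log2-odds weights."""
    width = len(pfm["A"])
    pwm = []
    for i in range(width):
        total = sum(pfm[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((pfm[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def scan(seq, pwm):
    """Score every forward-strand window; returns (start, window, score)."""
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits


def best_hit(seq, pwm):
    """Highest-scoring window, or None if the sequence is too short."""
    return max(scan(seq, pwm), key=lambda hit: hit[2], default=None)


# Toy 4-bp matrix favouring GGAA; not a curated JASPAR profile.
toy_pfm = {"A": [0, 0, 10, 10], "C": [0, 0, 0, 0],
           "G": [10, 10, 0, 0], "T": [0, 0, 0, 0]}
print(best_hit("TTGGAATT", pfm_to_pwm(toy_pfm)))
```

A complete scan would also score the reverse complement and apply a score threshold, for example a fraction of the maximum attainable score, before reporting a site.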
To corroborate binding to the putative ELF1 motif or one of the MAZ motifs of Segment A, we performed electrophoretic mobility shift assays (EMSA), as we described previously19, using a 5′-end biotin-conjugated DNA substrate containing the ELF1 or MAZ binding motif present in Segment A and nuclear extracts of MCF7 cells (Fig. 6). When the DNA substrate for ELF1 (ELF1-RE, Fig. 6a) or MAZ (MAZ-RE, Fig. 6b) was incubated with nuclear extracts, a DNA–protein complex (asterisk) was observable on the gel. The inclusion of the ELF1 or MAZ antibody further retarded the electrophoretic migration of the protein–DNA complex. These results suggest that ELF1 or MAZ specifically interacts with the DNA substrate. The abrogation of the protein-DNA interaction with a DNA substrate bearing mutant sequences (Mut DNA) or with the inclusion of a 250-fold molar excess of the unbiotinylated (UnB DNA) ELF1-RE or MAZ-RE DNA further indicates that Segment A contains sequences for the binding of ELF1 or MAZ.
To assess the effects of ELF1 or MAZ on the expression of the reporter enzyme driven by Segment A, we transiently transfected MCF7 cells with the expression vector bearing the HA-ELF1 or HA-MAZ cDNA for 24 h (Fig. 6c). Results indicated that ELF1 or MAZ enhances the enzyme activity compared to levels observed with the vector. Moreover, we observed reduced levels of reporter enzyme activity driven by Segment A with the deleted ELF1 (SegAΔELF1-RE) or MAZ (SegAΔMAZ-RE) motif compared to the native sequence in MCF7 cells, whether or not cells were transfected with the expression vector bearing the HA-ELF1 or HA-MAZ cDNA. These results indicate that Segment A contains sequences for the binding of ELF1 and MAZ critical for the promoter activity.
In assessing the effects of ELF1 or MAZ on CXXC5 expression in a chromatin context, we transiently transfected MCF7 cells with the expression vector bearing none (EV), the HA-ELF1, or HA-MAZ cDNA for 48 h. HA-ELF1 (Fig. 6d) or HA-MAZ (Fig. 6e) augmented the expression of CXXC5 as well as the corresponding control OAS1 or MYC compared to the vector as assessed with RT-qPCR. Furthermore, the reduction of ELF1 protein levels (Fig. 6f) in transient transfections in MCF7 cells with a siRNA pool that targets ELF1 effectively attenuated the expression of CXXC5 or OAS1 (Fig. 6h). We also observed effective repression of the protein levels of MAZ by a MAZ-specific siRNA pool (Fig. 6g). Unexpectedly, however, the suppression of MAZ synthesis did not alter the CXXC5 or the MYC expression (Fig. 6i). This suggests that MAZ at steady-state conditions in contrast to ELF1 may not contribute to CXXC5 expression.
Thus, it appears that although ELF1 and MAZ participate in the expression of CXXC5, the contributory effect of these TFs on the CXXC5 expression could be mechanistically distinct and context-dependent.
Segment C may contain a G-quadruplex
Our reporter assays suggested that Segment C, which has a high G-C content, in the presence of other segments attenuates and alone represses the activity of the promoter driving the expression of the Luciferase cDNA as the reporter enzyme (Fig. 2).
G-rich sequences can self-associate into stacks of G-quartets to form complex structural motifs known as G-quadruplexes (G4s), which arise from Hoogsteen hydrogen bonding of four guanines arranged within a planar quartet (G‐quartet) linked by loop nucleotides75,76. Self‐stacking of G4 structures is further stabilized by monovalent cations, including K+75,76. G4s could play essential roles in many cellular processes, including transcription75,76. The consensus motif G3+N1–7G3+N1–7G3+N1–7G3+ (G3+ = a run of three or more guanines; N1–7 = a loop of any 1–7 nucleotides) is used to identify potential G-quadruplexes from the primary sequence76.
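A consensus of this kind, four runs of three or more guanines separated by loops of 1–7 nucleotides, translates directly into a regular expression. The sketch below is a minimal illustration with hypothetical names. It searches one strand only and reports non-overlapping matches, so a C-rich strand would need to be reverse-complemented first.

```python
import re

# Four G-runs (>= 3 guanines each) separated by three loops of 1-7 nt.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_motifs(seq):
    """Return (start, end, matched sequence) for each putative G4 motif.

    Coordinates are 0-based with an exclusive end. Matches are
    non-overlapping and leftmost-first, as returned by re.finditer.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


# Toy sequence, not the Segment C sequence of this study.
print(find_g4_motifs("TTGGGAGGGTGGGAGGGTT"))
```

Pattern matching of this type is strict about run and loop lengths, which is one reason score-based predictors are often used alongside it.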
Analysis of the sequence of Segment C strands with G4 prediction tools including G4Hunter77 (http://bioinformatics.ibp.cz/) and G4CatchAll78 (http://homes.ieu.edu.tr/odoluca/G4Catchall/) revealed the possible presence of a G4 on the positive strand (Fig. 7a). Based on these results, we initially explored the presence of a G4 structure in Segment C using Thioflavin T (ThT), which interacts selectively with G4s resulting in a significant fluorescence enhancement79. Incubation with ThT of the putative G4 sequence of Segment C (SegC-G4), 34 nt long, in the presence of 70 mM KCl led to a substantial fluorescence increase of F − F0 = 720 ± 15 (Fig. 7b,c). This increase in the fluorescence was comparable to that observed with the Pu22 sequence (F − F0 = 495 ± 13) present in the promoter region of the VEGF (vascular endothelial growth factor) gene, which was previously characterized to form a G4 structure80,81. These results, together with the low fluorescence intensities of ThT at 488 nm with various mutant sequences of SegC-G4 designed to disrupt the G4 formation (SegC-Mut1-3), SegC complementary (SegC-Comp), or dT32, suggest the possible adoption of a G4 structure by the SegC-G4 sequence.
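For orientation, the scoring idea behind G4Hunter-type predictors can be sketched in a few lines. The sketch follows the published scoring scheme as we understand it: each guanine in a run of n consecutive guanines scores min(n, 4), each cytosine in a run scores the negative equivalent, and A or T scores zero, with the mean taken over the sequence. The function names are hypothetical, and the web tool cited above may differ in windowing and thresholds, which are not reproduced here.

```python
from itertools import groupby


def g4hunter_scores(seq):
    """Per-nucleotide scores: +min(n, 4) for G-runs, -min(n, 4) for C-runs."""
    scores = []
    for base, run in groupby(seq.upper()):
        n = len(list(run))
        if base == "G":
            value = min(n, 4)
        elif base == "C":
            value = -min(n, 4)
        else:
            value = 0
        scores.extend([value] * n)
    return scores


def g4hunter_mean(seq):
    """Mean score; positive values indicate G-richness with G-runs."""
    scores = g4hunter_scores(seq)
    return sum(scores) / len(scores) if scores else 0.0


# Toy sequence, not SegC-G4.
print(g4hunter_mean("GGGTTAGGG"))
```

A strongly positive mean on one strand and the mirrored negative value on its complement are consistent with a G4-prone sequence on the G-rich strand.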
To verify that the SegC-G4 sequence indeed forms a G4 structure, we also used the Circular Dichroism (CD) approach, which is commonly utilized to determine the G4 topology of G-rich sequences82,83. The presence of a parallel G4 is characteristically associated with a positive band around 260 nm and a negative band around 240 nm. On the other hand, the formation of an antiparallel G4 reveals a positive band around 295 and 240 nm and a negative band near 260 nm. The hybrid type G4 structure is associated with a positive band at 290 nm together with a shoulder band at 260 nm and a negative band at around 240 nm82–84. In the CD spectrum of SegC-G4, a negative peak around 240 nm, a positive peak around 260 nm and another positive peak around 290 nm were observable (Fig. 7d). The presence of a negative peak around 240 nm and a positive peak around 260 nm is an indication of a parallel G4 structure. Besides, the presence of a positive peak around 290 nm suggests the existence of a second structure with a hybrid topology. The CD spectrum of SegC-Comp reveals an intense absorption maximum around 283 nm with a negative band around 254 nm, which might be correlated with the formation of an i-motif structure due to the C-rich content of the strand85. The mutant sequences (SegC-Mut1, SegC-Mut2, and SegC-Mut3) did not show the characteristic peak intensities of G4s. Compared to SegC-Mut1 and SegC-Mut2, the high intensity of the CD spectrum at 220 nm of SegC-Mut3 likely results from the A-rich content of this sequence86.
Thermal denaturation, using spectroscopic methods, offers an approach for measuring the stability of nucleic acid structures87. CD thermal denaturation experiments were conducted to further examine the G-quadruplex structure of SegC-G4. CD spectra were recorded as a function of temperature (between 15 °C and 95 °C) (Supplementary Information Fig. S12a). Thermal denaturation profile obtained by monitoring ellipticity change at 262 nm revealed a melting temperature (Tm) of 65 °C (Supplementary Information Fig. S12). Furthermore, thermal denaturation as a function of temperature recorded with changes in UV–Vis absorbance at 295 nm, a characteristic wavelength for G-quadruplexes, revealed also a characteristic thermal denaturation curve of a G4 structure88 in SegC-G4 (Fig. 7e). Additionally, a decrease in Tm of the SegC-G4 sequence from 65 to 45 °C in the absence of 70 mM KCl as the source of K+ for stabilization of G4 structures75,76 further indicates that the SegC-G4 sequence adopts a G4 conformation. On the other hand, in agreement with our CD data, no thermal denaturation curve was obtained for SegC-Mut1, SegC-Mut2, or SegC-Mut3 sequence (Fig. 7e).
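As an illustration of how a melting temperature can be read from such a denaturation profile, the following minimal sketch normalizes the signal between its extremes and interpolates the temperature at which half of the signal change has occurred. It assumes a simple two-state transition with flat baselines, which may not hold for real data; the function names and the synthetic curve (generated with a midpoint of 65 °C) are hypothetical and are not the analysis used in this study.

```python
import math


def fraction_folded(signal):
    """Scale a melting signal to the 0-1 range using its minimum and maximum."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("signal is flat; no transition to analyze")
    return [(value - lo) / (hi - lo) for value in signal]


def estimate_tm(temps, signal):
    """Temperature at which the normalized signal crosses 0.5 (linear interpolation)."""
    frac = fraction_folded(signal)
    for i in range(1, len(frac)):
        a, b = frac[i - 1], frac[i]
        if a != b and (a - 0.5) * (b - 0.5) <= 0:
            t = (0.5 - a) / (b - a)
            return temps[i - 1] + t * (temps[i] - temps[i - 1])
    raise ValueError("signal never crosses the half-maximal value")


if __name__ == "__main__":
    # Synthetic two-state curve sampled every 5 degrees between 15 and 95 degrees Celsius.
    temps = list(range(15, 96, 5))
    signal = [1.0 / (1.0 + math.exp((t - 65.0) / 4.0)) for t in temps]
    print(round(estimate_tm(temps, signal), 1))
```

The same function can be applied to ellipticity or absorbance readings, since only the relative change of the signal with temperature enters the estimate.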
Since the fusion of Segment C to the 3′-end of the CMV promoter effectively repressed the promoter-induced reporter enzyme activity, in contrast to Segment D, which had minimal effects (Fig. 2f), we wanted to assess whether the removal of this G4 sequence from Segment C would restore the CMV-driven enzyme activity. In MCF7 cells transiently transfected with the reporter plasmid bearing the CMV promoter that drives the Luciferase cDNA as the reporter, the repression of enzyme levels by Segment C (CMV-Pr-C), which was not observed with Segment D (CMV-Pr-D), was indeed effectively alleviated by the deletion of the G4 sequence in Segment C (CΔG4) (Fig. 7f).
Discussion
The majority of human gene promoters of housekeeping, developmental and tissue-specific genes are located within unmethylated CGIs that display a chromatin state permissive for transcription. Transcription from these ‘broad or dispersed’ promoters is initiated at multiple closely spaced TSSs, in contrast to ‘focused or sharp’ non-CpG promoters of cell-type-specific genes, within which a single TSS initiates transcription3,4,56,74. While sequence motifs for the TATA-box and the downstream promoter element (DPE) positioned at distinct locations relative to the TSS tend to characterize non-CpG promoters, CGI promoters generally lack these elements3,4,74. We identified here transcript variant 2, which has the highest expression level among transcript variants in MCF7 and HL60 cells, as the main transcript of CXXC5. We also defined a DNA segment within and at the 5′ surrounding sequences of Exon3 as the core promoter region required for CXXC5 expression. Based on the DNA sequence composition and motifs, the chromatin configuration, and the presence of multiple TSSs together with an active PolII at the CXXC5 promoter, we suggest that a CGI promoter drives the expression of CXXC5.
Transcription is the result of the integrated effects of multiple inputs mediated by TFs whose activities are dynamically modulated in response to internal and external signaling cascades. Studies indicate that due to a high CpG density89 and inherently unstable nucleosome architecture90, the chromatin accessibility of CGIs is critical for the binding of various transcription factors, including the ZF-CXXC family proteins, and the subsequent recruitment of DNA/histone modifiers and the RNA polymerase machinery for transcription3,4,74. Our studies coupled with bioinformatics analyses suggest that the AFF1, ELF1, MAZ, PRDM10, TFAP2C, TFAP4, and ZBTB7A transcription factors may be involved in the regulation of CXXC5 expression; of these, ELF1, MAZ, TFAP2C, TFAP4, and ZBTB7A appear to be capable of interacting with sequences enriched with C and/or G nucleotides within the nucleosome-free CXXC5 promoter. We verified here that ELF1 and MAZ contribute to CXXC5 expression by directly interacting with cognate sequence motifs present in the CXXC5 promoter.
The regulation of CXXC5 expression is likely multifactorial, involving many transcription factors with activator or repressor functions responding to distinct signaling pathways. ELF1 (E74 Like ETS Transcription Factor 1), a ubiquitously expressed gene product, is a member of the ELF subfamily of the ETS transcription factor family, which plays diverse roles in regulating many essential processes including embryonic development, cell cycle control, cell proliferation, apoptosis, cell migration, hematopoiesis, and angiogenesis91,92. ELF1 interacts with a permutation of a consensus core sequence, AGGAA (also, Supplementary Information, Fig. S11)93, on DNA and acts as an activator or repressor of target gene expression. ELF1 could regulate gene expression through interaction with RB1 as well67. The interaction of ELF1 with the pocket region of hypo-phosphorylated RB1 was shown to be critical for the expression of genes involved in cell cycle progression during T cell activation94. It is well established that hypo-phosphorylated RB1 restricts the ability of cells to replicate DNA by preventing progression from G1 to the S phase of the cell cycle. It does so by repressing cell cycle progression genes that are regulated, through direct binding to E2F responsive elements, by the E2F family and its obligatory dimerization partners, the DP family proteins68,95. Hyper-phosphorylation of RB1 leads to the dissociation of RB1 from E2F-DP complexes and the subsequent activation of target gene expression68,95. We found here that ELF1 and RB1 are co-present in the promoter pull-down and that each is enriched at the CXXC5 promoter as assessed with ChIP. Moreover, our studies revealed that ELF1 interacts with an ELF1 sequence motif on Segment A and modulates CXXC5 expression. These observations, therefore, raise the possibility that the interaction between ELF1 and RB1 drives CXXC5 expression in a cell cycle-dependent manner. 
Indeed, our ongoing studies suggest that this might be the case.
Like ELF1, MAZ is expressed ubiquitously in human tissues at varying levels96. MAZ is a transcription factor with six Cys2-His2 zinc fingers and recognizes a permutation of a cognate sequence of GGGAGGG (also, Supplementary Information, Fig. S11) primarily present on nucleosome-free regions in broad promoters in contrast to focused promoters97. MAZ is implicated in a wide range of transcriptional roles, including transcription initiation69, transcriptional pausing of PolII during transcription elongation98, alternative splicing98,99, and transcription termination leading to the activation of polyadenylation69,100. We observed here that MAZ is enriched at the CXXC5 promoter as assessed by ChIP, binds to the MAZ binding motif as assessed by EMSA, and, when overexpressed, modulates the expression of CXXC5 from the reporter promoter construct or the endogenous gene locus. These findings suggest that MAZ is a critical contributor to the expression of CXXC5. However, we also observed that the effective reduction of MAZ protein levels by a siRNA approach altered neither the expression of CXXC5 nor that of MYC, used as the control, in contrast to the reduction in ELF1 protein levels, which led to a decrease in CXXC5 expression. This suggests that MAZ may not be involved in the expression of CXXC5 under steady-state conditions but is involved in CXXC5 expression in response to a signaling pathway. It is also likely that the decrease/absence of MAZ might be compensated by other transcription factors that bind to DNA motifs similar to those of MAZ. Indeed, the sequences of the binding sites for MAZ and SP1 (Specificity factor 1), which are often found within the same gene, are very similar: GGGAGGG and GGGCGG, respectively65,66. Studies further showed that SP1 binds, and competes with MAZ for binding, to the same GC-rich DNA-binding sites101.
Although how ELF1 or MAZ modulates CXXC5 expression is unclear, alterations in histone modifications of target gene promoters upon TF binding to DNA appear to be critical for gene expression65,102. MAZ, for example, represses transcription by recruiting histone deacetylases including HDAC1, HDAC2, and HDAC365. Moreover, the interaction of FAC1 (Fetal Alzheimer's clone 1), a truncated isoform of the chromatin remodeler BPTF (bromodomain and PHD domain transcription factor), with MAZ is shown to alter the transcriptional activity of the protein103. It is therefore likely that upon association with DNA, MAZ and/or ELF1 interact, directly or through co-regulatory proteins, with histone modifiers and establish a chromatin state and structure permissive/restrictive for the transcriptional regulation of CXXC5. Consistent with this, our pull-down assay revealed the association of the histone acetyltransferase KAT2A as well as the histone demethylases JMJD1C, KDM1A, and KDM2A with Segment A.
While Segment A of the first exon of transcript variant 2 constitutes the core promoter for the CXXC5 gene, the surrounding regions may contribute to gene expression as they contain potential binding motifs for various transcription factors (data not shown) as well as structural features, exemplified here with the presence of a G4 conformation in Segment C. As non-canonical nucleic acid secondary structures formed within G-rich sequences in both DNA and RNA, G4s are widely found in promoter regions, immunoglobulin class switch regions, ribosomal DNA, mitochondrial DNA, and replication initiation regions as well as in the extended repeat sequences in various pathologies104,105. G4s play fundamental roles in transcription, replication, genome stability, and epigenetic regulation as well as post-transcriptional events including RNA transport, localization, and translation104,105. G4 structures could act as modifiers of various TFs, as exemplified with p53106, at target promoter sites in the regulation of gene expression. The abilities of various proteins including helicases, chromatin/histone modifiers, and transcription factors to interact with G4s may be critical for the dynamic regulation of gene expression. For example, the binding of nucleolin to the nuclease hypersensitive element III1 (NHE III1) of MYC induces the formation of a G4 structure and reduces MYC transcription107; the binding of NME2 (non-metastatic cell 2; Nucleoside diphosphate kinase B), on the other hand, unfolds the G4 structure and promotes the transcription of MYC108. MAZ is also shown to bind to secondary DNA structures including G4s, which appears to be critical for transcriptional events of target genes109–111. In addition to Segment A, we also observe adjacent binding motifs for MAZ (GGGGAGGGGGAGGAGGG) in Segment C (Fig. 7a; Supplementary Information, Fig. S11). 
This raises the possibility that the interaction of MAZ with Segment C could also modulate the transcription of CXXC5 by forming or resolving the G4 structure.
Although promoters constitute the key platform for the assembly of pre-initiation complexes to mediate the directionality and accuracy of transcription initiation, enhancers are DNA regulatory elements that determine spatio-temporal expression even over long distances regardless of their orientation relative to the core promoter112,113. It is well established that enhancers acting as binding targets for lineage-specific TFs are critical components of transcription by establishing proximity interactions with promoters in a cell-type-specific manner112,113. Given that transcription requires dynamic protein–protein interactions and the subsequent multistep ordered assembly of protein complexes within a temporally modulated chromatin architecture, a better understanding and delineation of the mechanistic features of CXXC5 expression in response to distinct signaling pathways including retinoic acid11, TGF-β12, BMP413,14, Wnt3a15–17 and estrogen18–20 would be a valuable input for both physiology and pathophysiology.
Materials and methods
Biochemicals
Restriction and DNA-modifying enzymes were obtained from New England BioLabs (Beverly, MA, USA) or ThermoFisher (Waltham, MA, USA). Chemicals were obtained from Sigma-Aldrich (Germany) or ThermoFisher. PageRuler Prestained Protein Ladder (ThermoFisher; 26616) or PageRuler Plus Prestained Protein Ladder (ThermoFisher; 26620) was used as the molecular mass (MM) marker.
Cell culture and transfection
MCF7 cells were grown in phenol red-free, high glucose (4.5 g/L) containing Dulbecco’s Modified Eagle’s Medium (DMEM, Lonza, Belgium, BE12-917F) supplemented with 10% fetal bovine serum (FBS, Lonza), 1% L-Glutamine (Lonza, BE17-605E) and 1% Penicillin/Streptomycin (Lonza, Belgium) as described previously19,29,114. HL60 cells derived from acute promyelocytic leukemia were grown in phenol red-free, low glucose (1 g/L) containing DMEM supplemented with 10% FBS, 1% L-Glutamine (Lonza, BE17-605E), and 1% Penicillin/Streptomycin. MCF7 cells were transiently transfected with Turbofect transfection reagent (R0533; ThermoFisher) for 48 h if not otherwise specified. Protein concentrations in extracts were assessed with a Bradford protein assay kit (Bio-Rad Life Sciences; 5000001).
Engineering of reporter vectors
To assess the promoter activity of the genomic region of CXXC5, we generated Luciferase reporter vectors bearing a DNA fragment containing the putative promoter elements of the CXXC5 gene. We used the pGL3-Basic Luciferase Reporter vector that bears the Firefly Luciferase cDNA as the reporter enzyme (Promega Corp., Madison, WI, USA). For the engineering of the reporter vector bearing the genomic region containing the putative CXXC5 promoter, a DNA fragment of 1975 bp of the CXXC5 locus (GRCh38.p12 Primary Assembly, chromosome 5: 139647220–139649173) generated by PCR using the genomic DNA of MCF7 cells as a template was inserted into the pGL3-Basic vector with appropriate restriction enzymes. To increase the resolution of the putative CXXC5 promoter region, we generated deletions from both the 5′ and 3′ ends of the region by PCR and inserted the resulting fragments into the pGL3-Basic vector with appropriate restriction enzymes. All constructs were sequenced for PCR fidelity. TFF1 (also known as the pS2 gene) is a well-studied estrogen-responsive gene19,52. The human TFF1 gene confers E2 responsiveness through the binding of ER to a non-consensus ERE115–117. The pGL3-TFF1 reporter construct is responsive to E2 in transiently transfected cells synthesizing estrogen receptor(s), exemplified with MCF7 cells19,118,119. We used the pGL3-TFF1 reporter vector bearing the estrogen-responsive TFF1 gene promoter as a control under the steady-state cellular growth condition, in which the growth medium contains unprocessed fetal bovine serum (FBS) as opposed to FBS treated with charcoal–dextran to remove endogenous steroid hormones, including estrogens. In transfections, transfection efficiency was monitored with a reporter vector bearing the CMV promoter that drives the expression of the Renilla Luciferase cDNA (pCMV-RL, Promega Corp., Madison, WI, USA). For luciferase studies, cells were seeded at 4 × 10^4 cells/well in 48-well plates for 48 h. 
Cells were then transiently transfected with a 125 ng reporter vector together with 0.5 ng pCMV-RL using Turbofect. Luciferase assays were performed with a Dual-Luciferase Assay Kit (Promega, Corp., Madison, WI, USA) as described previously19.
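A dual-luciferase readout of this kind is commonly processed by dividing the Firefly signal of each well by the Renilla signal of the same well to correct for transfection efficiency, and then expressing the result relative to a control vector. The following minimal sketch illustrates that arithmetic; the function names, the choice of control, and the well values are hypothetical and are not taken from this study.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Firefly signal corrected for transfection efficiency using the co-transfected Renilla signal."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def fold_over_control(sample_wells, control_wells):
    """Mean normalized activity of a construct relative to a control vector.

    Each well is a (firefly, renilla) pair of raw luminescence readings.
    """
    sample = mean([normalized_activity(f, r) for f, r in sample_wells])
    control = mean([normalized_activity(f, r) for f, r in control_wells])
    return sample / control


if __name__ == "__main__":
    # Hypothetical luminescence readings, for illustration only.
    promoter_construct = [(1000.0, 100.0), (1200.0, 100.0)]
    empty_vector = [(100.0, 100.0), (120.0, 100.0)]
    print(fold_over_control(promoter_construct, empty_vector))
```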
PCR and RT-qPCR
Total RNA isolated from MCF7 or HL60 cells was used for cDNA synthesis (The RevertAid First Strand cDNA Synthesis Kit, Thermo-Fisher), and transcript variant identification was carried out by PCR with transcript variant-specific primer sets (Supplementary Information, Table 1) followed by TA-cloning into the pGEM-T vector (Promega Corp., Madison, WI, USA) and sequencing (PRZ Biotechnology, Ankara, Turkey). Transcript variant quantification studies were carried out with RT-qPCR. The SsoAdvanced Universal SYBR Green Supermix (Bio-Rad Life Sciences Inc., Hercules, CA, USA), transcript variant-specific qPCR primers (Supplementary Information, Table 1), and DMSO when necessary, were used. Expression levels of transcript variants were assessed with the efficiency-corrected form of the 2^(−ΔCT) method120 and normalized using the RPLP0 expression levels. Relative expression levels of CXXC5, and ELF1- or MAZ-regulated genes were assessed using the 2^(−ΔΔCT) method120 and normalized using the RPLP0 expression levels. Results were adjusted to the expression level of transcript variant 1, which was arbitrarily set to one. In all RT-qPCR experiments, MIQE Guidelines were followed121.
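The two quantification schemes named above can be sketched as follows. This is one common formulation, given here under stated assumptions: amplification efficiencies are expressed as the per-cycle amplification factor (2.0 for 100% efficiency), the reference gene plays the role of RPLP0, and all function names and Ct values are hypothetical examples and not data from this study.

```python
def relative_expression(ct_target, ct_reference, eff_target=2.0, eff_reference=2.0):
    """Efficiency-corrected expression of a target relative to a reference gene.

    With both amplification factors equal to 2.0 this reduces to 2 ** -(Ct_target - Ct_reference).
    """
    return (eff_reference ** ct_reference) / (eff_target ** ct_target)


def ddct_fold_change(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Fold change between two conditions by the 2 ** -(delta delta Ct) calculation."""
    d_treated = ct_target_treated - ct_ref_treated
    d_control = ct_target_control - ct_ref_control
    return 2.0 ** -(d_treated - d_control)


def scale_to_variant(expression_by_variant, calibrator):
    """Express every variant relative to a calibrator variant, which is set to one."""
    base = expression_by_variant[calibrator]
    return {name: value / base for name, value in expression_by_variant.items()}


if __name__ == "__main__":
    # Hypothetical Ct values; the reference gene Ct is 20 in every sample.
    variants = {
        "variant 1": relative_expression(28.0, 20.0),
        "variant 2": relative_expression(24.0, 20.0),
    }
    print(scale_to_variant(variants, "variant 1"))
    print(ddct_fold_change(24.0, 20.0, 25.0, 20.0))
```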
5′ or 3′ rapid amplification of cDNA ends (5′RACE and 3′RACE)
For 5′RACE or 3′RACE studies, we used the RiboMinus Human/Mouse Transcriptome Isolation Kit (#K155001, Thermo Scientific, USA) to enrich mRNA in the total RNA population from MCF7 cells. We performed rRNA removal according to the manufacturer’s instructions. The method is based on the selective depletion of rRNAs by hybridization to Locked Nucleic Acid (LNA) probes conjugated to magnetic beads. LNA, also referred to as inaccessible RNA, is a modified RNA nucleotide that significantly enhances hybridization to DNA or RNA. After the hybridization, rRNAs bound to the LNA probes were captured with a magnetic stand, and the supernatant containing largely mRNAs depleted of rRNA was recovered. Phenol:chloroform:isoamyl alcohol extraction and ethanol precipitation were then used for the purification of mRNAs.
For the identification of 5′- and 3′-ends of CXXC5 transcripts as well as of TFF1 as control, we used the FirstChoice RLM-RACE Kit (AM1700, ThermoFisher Scientific, USA) as directed by the manufacturer. For 5′RACE, in brief, purified mRNA (500 ng) was treated with calf intestinal alkaline phosphatase (CIP) at 37 °C for one hour. Following the termination of the CIP reaction, RNA was extracted with phenol:chloroform:isoamyl alcohol and ethanol precipitated. RNA resuspended in nuclease-free water was then treated with tobacco acid pyrophosphatase at 37 °C for one hour and subjected to ligation using a 5′RACE adapter and T4 RNA ligase. Ligated RNA products were subsequently reverse transcribed with M-MLV reverse transcriptase (M-MLV-RT) using random decamers at 42 °C for one hour. An aliquot of the reaction was then used for outer 5′ RLM-RACE PCR with a CXXC5- or TFF1-specific 5′RACE outer primer (Supplementary Information, Table 1). The outer PCR reaction was followed by inner 5′ RLM-RACE PCR using a CXXC5- or TFF1-specific 5′RACE inner primer and a 5′RACE inner primer (Supplementary Information, Table 1). For 3′RACE, purified mRNA was subjected to reverse transcription using M-MLV-RT and the 3′RACE Adapter provided by the kit at 42 °C for one hour. An aliquot of the reaction was used for PCR using a CXXC5- or TFF1-specific 3′RACE outer primer and the 3′RACE RLM adapter outer primer (Supplementary Information, Table 1). For both 5′RACE and 3′RACE, PCR amplicons were column purified (DNA Clean & Concentrator-25, D4033, Zymo Research, Irvine, CA, USA), cloned into a vector, and sequenced (PRZ, Turkey).
Northern blotting
Preparation of biotin-tagged probes
PCR amplicons for targeted identification of Exon3 and of the exon boundaries of Exon10 and Exon11 of CXXC5, as well as of the GAPDH cDNA fragment containing Exon5-8, were cloned into a vector. We then used biotin-conjugated vector-specific primers (Supplementary Information, Table 1) for the PCR amplification of double-stranded probe sequences. To examine the presence of CXXC5 transcripts in MCF7 and HL60 cells, a northern blot assay was performed with the NorthernMax Kit (Thermo-Fisher, AM1940) according to the manufacturer’s instructions. In brief, 10 µg of RiboMinus-treated RNA samples as well as 3 µl of RNA ladder (RiboRuler High Range RNA Ladder, #SM1821, Thermo Scientific, USA) were mixed with 3 volumes of formaldehyde loading dye and incubated at 65 °C for 15 min. Samples were then loaded onto a denaturing agarose-LE gel, electrophoresed for 2 h, and transferred onto a positively charged PVDF membrane (Sigma-Aldrich, Roche, #11209272001). Transferred mRNAs were then cross-linked to the membrane with a UV transilluminator (312 nm wavelength) for 10 min. Membranes were placed into 15 ml sterile falcon tubes and pre-hybridization was initiated using 6 ml of ULTRAhyb buffer pre-heated to 42 °C in a vertical rotator for 40 min. Biotin-tagged probes were diluted tenfold with TE buffer containing 10 mM EDTA to a final volume of 100 µl and denatured at 90 °C for 10 min. The mixture was immediately added onto the membranes in falcon tubes and placed in the oven for hybridization overnight. Membranes were then washed twice with a low stringency buffer at RT for 5 min, followed by washing twice with a high stringency buffer for 15 min at 42 °C. Membranes were subsequently subjected to the Chemiluminescence Nucleic Acid Detection Module (89880, Thermo Scientific, USA), which enables the detection of biotin-tagged nucleic acids with HRP-conjugated streptavidin, according to the manufacturer’s instructions. 
Membranes were visualized with the Chemidoc MP system (Bio-Rad, USA).
Bisulfite conversion of DNA for methylation analysis
For methylation analyses of CpG-rich regions of the CXXC5 locus, we subjected 500 ng of gDNA isolated from MCF7 cells to the bisulfite reaction for the conversion of unmethylated cytosine residues to uracil using the EZ-DNA Methylation Lightning Kit (Zymo Research, #D5030). The bisulfite-converted DNA was subsequently used as the template for PCR with bisulfite primers (Supplementary Information, Table 1) designed with the MethylViewer122 tool (http://www.insilicase.com/Desktop/Methyl-Viewer.aspx). It should be noted that bisulfite primers with no or at most one CpG position conserved were readily designable for sequences spanning nucleotides −930 to +103 of Exon3 of the CXXC5 gene. However, due to the very high number of CpG positions from the center of Exon3 until the end of Exon4, the design of bisulfite primers was precluded. We instead designed methyl-specific primers, which were based on methylated and unmethylated DNA sequences generated after bisulfite conversion. Methyl-specific PCRs were carried out with primer sets containing three or more CpG sites, and these regions were PCR amplified as 3 overlapping segments. PCR was carried out with 2.5 units of LongAmp Taq Polymerase (NEB, M0323) in a 50 µl total reaction containing 0.5 µM forward and reverse primers. Template DNA was added to the reaction at 90 °C to prevent nonspecific primer annealing, and the first two cycles of the reaction were carried out only in the presence of the reverse primer complementary to the sense strand to avoid the formation of primer dimers. The subsequent 10 cycles were performed 5 °C above the annealing temperature followed by 30 cycles of PCR123. In all PCRs with bisulfite-converted DNA templates, we also used a bisulfite-converted “Universal Methylated Human DNA Standard” (Zymo Research, D5011) as control. 
PCR amplicons with expected sizes were excised from agarose gels, purified, and cloned into the pGEM-T vector (Promega Corp., Madison, WI, USA) by TA-cloning and sequenced (PRZ Biotechnology, Ankara, Turkey). Sequences were analyzed using the QUMA57 tool.
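The logic of bisulfite sequencing analysis can be illustrated with a minimal in silico sketch: unmethylated cytosines read as thymine after conversion and PCR, whereas methylated cytosines remain cytosine, so each CpG in a sequenced clone can be called by comparing it with the unconverted reference. This is not the QUMA algorithm; the sketch assumes top-strand reads that align to the reference without gaps, and the function names and sequences are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate conversion of the top strand: unmethylated C reads as T, methylated C stays C."""
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def cpg_positions(seq):
    """Zero-based positions of the C in every CpG dinucleotide."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def methylation_calls(reference, read):
    """Call each reference CpG in a gap-free aligned read: True if C is retained, False if read as T."""
    if len(reference) != len(read):
        raise ValueError("reference and read must be the same length (gap-free alignment assumed)")
    calls = {}
    for i in cpg_positions(reference):
        base = read[i].upper()
        if base == "C":
            calls[i] = True
        elif base == "T":
            calls[i] = False
        # Any other base is left uncalled (sequencing error or mismatch).
    return calls


if __name__ == "__main__":
    reference = "ACGTCCGA"  # hypothetical fragment with CpGs at positions 1 and 5
    clone = bisulfite_convert(reference, methylated_positions={1})
    print(clone)
    print(methylation_calls(reference, clone))
```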
Assessing nucleosome occupancy with micrococcal nuclease assay
To assess nucleosome occupancy at the putative promoter region of the CXXC5 gene, MCF7 cells grown in six-well plates for 48 h were fixed with 2% formaldehyde in 1xPBS for 15 min at RT with gentle shaking. To quench the formaldehyde, 0.125 M Glycine (BioShop, #GLN002) solution was added and cells were incubated for 10 min. Cells were then washed with 1xPBS twice and 0.4% Triton X-100 was added onto the cells for permeabilization. Cells were subsequently washed twice with 1 × Micrococcal Nuclease (MNase) buffer (50 mM Tris–HCl, 5 mM CaCl2, pH 7.9 at 25 °C). 500 or 1000 Gel Units (GU) of MNase (NEB, #M0247S) diluted in MNase buffer was added and cells were incubated for 30 min at 37 °C for chromatin digestion. The digestion reaction was stopped by the addition of 10 mM EGTA (Merck, 67425). Cells were then collected by scraping in a lysis buffer containing 1% SDS, 10 mM EDTA, 50 mM Tris–HCl pH 8.0, 0.5 mM PMSF, and Protease Inhibitor (PI). To reverse the crosslinks, NaCl was added to cell lysates to a final concentration of 200 mM and lysates were incubated at 95 °C for 5 min. To digest RNA, 0.2 mg/ml RNAse A (Thermo Scientific, EN0531) was added to the lysate and incubated at 37 °C for 30 min. Lastly, to digest cellular proteins, 0.25 mg/ml Proteinase K (Thermo Scientific, AM2542) was added to the lysate and incubated overnight at 65 °C. DNA was subsequently purified by phenol:chloroform:isoamyl alcohol (VWR, #K-169) extraction and ethanol precipitation. To analyze the digestion pattern, purified DNA was loaded on an agarose gel and bands corresponding to the tri-nucleosomal and mono-nucleosomal DNAs were excised and gel purified using the Zymoclean Gel DNA Recovery Kit (D4001). 50 ng of purified DNA or uncut genomic DNA was used as a template in PCR reactions to assess nucleosome occupancy. For the initial scanning, five overlapping regions (depicted as T1-5, Fig. 3) were amplified from the tri-nucleosomal DNA template. For further verification, 3 sub-regions (depicted as M1-3, Fig. 3) were analyzed from the mono-nucleosomal DNA by PCR.
Chromatin immunoprecipitation assay
ChIP assays were carried out as described previously19,119,124. In brief, MCF7 cells grown in medium supplemented with 10% FBS in T75 tissue plates for 48 h were fixed with 1% formaldehyde at RT for 15 min, lysed with Nuclei Lysis Buffer containing 1% SDS, 10 mM EDTA, 50 mM Tris–HCl pH 8.0, 0.5 mM PMSF, and 1X PIC (Roche), and actively sonicated for 20 min. Cell debris was pelleted and the supernatant was collected. After pre-clearing of nuclear extracts, the supernatant was incubated overnight with a species-specific (Mouse or Rabbit) IgG (Santa Cruz Biotechnology Inc., Santa Cruz, CA, USA), a PolII antibody (POLR2A, CTD4H8; Santa Cruz) for the precipitation of hypo- and hyper-phosphorylated PolII, or a Ser5-PolII antibody (POLR2A, Phospho-Rpb1 CTD, D9N5I, Cell Signaling Technology, Beverly, MA, USA) for the precipitation of Ser5 phosphorylated PolII. Nuclear extracts were also incubated with an antibody specific to CREB1 (D-12, Santa Cruz sc-377154), ELF1 (B-9, Santa Cruz, sc-133210), MAZ (133.7, Santa Cruz, sc-130915), or RB1 (Retinoblastoma, 4H1, Cell Signaling Technology, Beverly, MA, USA, #9309) overnight.
ChIP assays for total Histone 3 and H3K4me3 were performed following MNase digestion (as described in Micrococcal Nuclease Assay section) and sonication for 5 min followed by the incubation with a species-specific IgG, histone H3-1B1B2 mouse mAb (Cell Signaling Technology, #14269), or Histone 3 trimethylation at lysine 4, Tri-Methyl-Histone H3 (Lys4)-C42D8 Rabbit mAb (Cell Signaling Technology, #9751).
Samples were then subjected to immunoprecipitation with Protein G-coupled magnetic beads (New England BioLabs) for anti-mouse antibodies or Protein A/G coupled magnetic beads (New England BioLabs) for anti-rabbit antibodies. After washes, de-crosslinking, and protein digestion, DNA was recovered with phenol:chloroform: isoamyl alcohol followed by ethanol precipitation. Samples (1 µl of 60 µl elution) were subjected to qPCR using ChIP primers (Supplementary Information Table 1) specific to the putative CXXC5 promoter, the promoter of OAS1 (2′-5′-Oligoadenylate Synthetase 1), MYC (MYC Proto-Oncogene, BHLH Transcription Factor), or GAPDH (as the positive control, for PolII occupancy), or the Exon2 of MB (Myoglobin) as a negative control.
Pull-down assay
Nuclear protein extraction
MCF7 cells grown in T75 flasks were trypsinized and collected by centrifugation at 1000×g for 5 min at 4 °C. The pellet was washed with ice-cold PBS twice and packed cell volume (PCV) was determined. Cells were resuspended in 1xPCV of Buffer A [Swelling Buffer: 10 mM HEPES pH 7.9, 1.5 mM MgCl2, 10 mM KCl, 0.5% NP-40, freshly added; 0.5 mM Phenylmethylsulfonyl Fluoride (PMSF), 0.5 mM DTT, and 1xProtease Inhibitor (PI)] and rested on ice to allow cells to swell. Cells were then lysed by passaging 25 times through a 25-gauge needle. Lysed cells were centrifuged to pellet crude nuclei at 12,000×g for 20 s at 4 °C. The crude nuclear pellet was washed twice with 1xPCV of IB [ice-cold Wash Buffer: 10 mM HEPES pH 7.9, 1.5 mM MgCl2, 10 mM KCl, freshly added; 0.5 mM Phenylmethylsulfonyl Fluoride (PMSF), 0.5 mM DTT, and 1xProtease Inhibitor (PI)]. After centrifugation, the pellet was resuspended in 2/3 PCV of Buffer B [ice-cold, 20 mM HEPES pH 7.9, 1.5 mM MgCl2, 420 mM KCl, 0.2 mM EDTA, 2.5% glycerol, freshly added; 0.5 mM Phenylmethylsulfonyl Fluoride (PMSF), 0.5 mM DTT, and 1xProtease Inhibitor (PI)] and rested on ice for 30 min with occasional agitation. Samples were then centrifuged for 5 min at 4 °C. The supernatant was then diluted isovolumetrically to decrease the salt concentration to 125 mM with Buffer D [ice-cold, 20 mM HEPES pH 7.9, 1.5 mM MgCl2, 100 mM KCl, 0.2 mM EDTA, 10% glycerol, freshly added; 0.5 mM Phenylmethylsulfonyl Fluoride (PMSF), 0.5 mM DTT, and 1xProtease Inhibitor (1xPI)].
Promoter pull-down
Based on luciferase reporter results, we amplified by PCR from the genomic DNA of MCF7 cells a 220-bp DNA fragment that includes the CXXC5 promoter (Segment A; − 117 to + 103, + 1 being the first nucleotide of the annotated Exon3) and inserted it into a vector with appropriate restriction enzyme cut sites. Similarly, a 220-bp DNA fragment within Exon10 of the CXXC5 gene was cloned into the vector as control DNA. 5′ end-biotinylated forward and reverse primers specific to the vector were then used for the amplification of Segment A and the control Exon10 DNA sequences by PCR. Biotinylated double-stranded PCR amplicons were recovered from agarose gels with the Zymoclean Gel DNA Recovery Kit (Zymo Research).
Streptavidin magnetic beads (SMB, NEB) were blocked using 2% BSA in PBS for 2 h at 4 °C followed by washes with 1xPBS twice. The blocked SMBs were resuspended in 200 µl PBS containing 0.5 mM Phenylmethylsulfonyl Fluoride (PMSF), 0.5 mM DTT, and 1xPI, which were then mixed with one ml of nuclear extracts for pre-clearing for 1 h at 4 °C in 300 µl 1xPBS. Subsequently, the pre-cleared nuclear extract was divided into three equal (about 400 µl) aliquots. Each aliquot of the extract was then mixed with 10 µg biotinylated double-stranded Segment A, control DNA, or beads alone in the presence of 10 µg of Poly[d(I-C)] to form the protein-DNA complexes overnight at 4 °C on a rotator. The SMB–DNA–protein mixtures were subsequently washed with 1xPBS three times for 5 min each and resuspended in 200 µl 1xPBS for Mass Spectrometry (MS) analyses.
Protein identification by mass spectrometry
MS analyses of two biological replicates were carried out at the Koç University Proteomic Facility (Istanbul, Turkey). The SMB–DNA–protein mixtures were washed with 50 mM NH4HCO3, followed by reduction with 100 mM DTT in 50 mM NH4HCO3 at 56 °C for 45 min, and alkylation with 100 mM iodoacetamide at RT in the dark for 30 min. MS Grade Trypsin Protease (Pierce) was added onto the beads for overnight digestion at 37 °C (enzyme to protein ratio of 1:100). The resulting peptides were purified using C18 StageTips (ThermoFisher). Peptides were analyzed by online C18 nanoflow reversed-phase HPLC (2D nanoLC; Eksigent) linked to a Q-Exactive Orbitrap mass spectrometer (ThermoFisher). The data sets were searched against the human SWISS-PROT database version 2014_08. Proteome Discoverer (version 1.4; ThermoFisher) was used to identify proteins. The final protein lists were analyzed using the STRING v11125 and DAVID63,126 databases.
In silico analysis of TF motifs for the CXXC5 locus
Binding motifs for transcription factors
To find TF binding motifs, we developed a motif search tool72 using all the available ChIP-Seq datasets at the Cistrome64 database. This tool takes as inputs: (1) a set of binding locations on a sample of ChIP-Seq reads using MACS2 peak locations, (2) the reference sequence of the genomic locus to analyze, and (3) the binding motifs for a specific transcription factor from the JASPAR73 database. The program conducts an approximate string search on the binding locations of the reference sequence using the consensus binding motif as the query sequence. The program generates both forward and reverse strand hits, which are ranked by a logarithmic sequence similarity score on the binding locations.
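The general idea of such a search can be illustrated with a minimal sketch that scans peak intervals of a reference sequence on both strands for windows within a given number of mismatches of a consensus motif. This is not the tool described above: the mismatch-count ranking stands in for the logarithmic similarity score, peaks are assumed to be half-open (start, end) intervals in zero-based coordinates, and the function names and the test sequence are hypothetical. The query GGGAGGG is the MAZ cognate sequence cited in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def search_motif(reference, motif, peaks, max_mismatches=1):
    """Approximate motif search restricted to peak intervals.

    Returns (position, strand, forward-strand window, mismatches) tuples,
    ranked by mismatches and then by position.
    """
    reference = reference.upper()
    queries = {"+": motif.upper(), "-": reverse_complement(motif)}
    k = len(motif)
    hits = []
    for start, end in peaks:
        for pos in range(max(start, 0), min(end, len(reference)) - k + 1):
            window = reference[pos:pos + k]
            for strand, query in queries.items():
                mismatches = hamming(window, query)
                if mismatches <= max_mismatches:
                    hits.append((pos, strand, window, mismatches))
    return sorted(hits, key=lambda hit: (hit[3], hit[0]))


if __name__ == "__main__":
    # Hypothetical 20-nt locus with one forward and one reverse-strand match.
    locus = "TTGGGAGGGTTCCCTCCCAA"
    for hit in search_motif(locus, "GGGAGGG", peaks=[(0, 20)], max_mismatches=0):
        print(hit)
```

Restricting the scan to peak intervals keeps motif hits tied to experimentally supported binding locations rather than to every sequence match in the locus.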
Electrophoretic mobility shift assay (EMSA)
EMSA was conducted as described previously19. Oligomers labeled with biotin at the 5′ end and bearing the ELF1 or MAZ binding motif sequence were purchased from Integrated DNA Technologies (IDT Europe; Belgium) and annealed. Double-stranded DNA fragments were incubated in the presence or absence of nuclear extracts (45 μg) for 15 min. Reactions were further incubated without or with the ELF1- or MAZ-specific antibody for another 15 min. Samples were subjected to electrophoresis on a 5% non-denaturing polyacrylamide gel. Gel contents were subsequently transferred electrophoretically to a nylon membrane and processed using the LightShift Chemiluminescent EMSA kit (Thermo-Fisher). In brief, the membrane was UV cross-linked and blocked for non-specific binding using a blocking buffer. The membrane was then probed with Streptavidin–Horseradish Peroxidase Conjugate in the blocking buffer for image development. Images were then captured using a ChemiDoc Imaging System (Bio-Rad).
Immunoblotting (IB)
IB was carried out as described previously19,29. Briefly, cells were grown in six-well tissue culture plates in medium supplemented with 10% FBS for 48 h and transfected with an siRNA pool targeting ELF1 or MAZ and an expression vector bearing the HA-tagged ELF1 or the MAZ cDNA for 48 h. Cells were collected and protein isolation was performed using the NE-PER protein extraction kit (Thermo-Fisher). Protein concentration was determined using the Bradford Protein Assay (Bio-Rad). Nuclear extracts were subjected to denaturing SDS-PAGE and transferred to a membrane, and proteins were probed with an antibody specific to ELF1 or MAZ followed by a secondary antibody conjugated with horseradish peroxidase (Advansta). The membranes were then re-probed with the HA antibody (Abcam, ab9110) and subsequently with an HDAC1 antibody (Abcam, ab19845) as a loading control. Images were developed using the ECL-Substrate (Advansta) and captured with a ChemiDoc Imaging System (Bio-Rad).
Assessment of the presence of a G-quadruplex in segment C
Sample preparation
Oligonucleotides for segment C of the CXXC5 Exon3 with a predicted G-quadruplex forming sequence (SegC-G4), its complementary strand (SegC-Comp), and mutant sequences (Mut1, Mut2, and Mut3) were purchased from Oligomer Biotechnology Inc. (Ankara, Turkey). The Pu22-G4 forming sequence of the VEGF promoter80 and dT32 were acquired from IDT. Concentrations of oligonucleotide stock solutions were determined by UV–Vis spectroscopy, using the molar extinction coefficients obtained from the IDT OligoAnalyzer Tool. All nucleic acid samples were prepared in 25 mM K-phosphate buffer at pH 7.0 in the presence of 70 mM KCl in Millipore water, at a DNA concentration of 3.0 µM per strand. The nucleic acid samples were heated at 93–95 °C for 5 min and cooled to room temperature overnight in a water bath to ensure the formation of the proper secondary structures. Thioflavin T (ThT) was purchased from Sigma Aldrich (St. Louis, MO, USA). The ThT stock solution was prepared in Millipore water, and its concentration was determined by UV–Vis spectroscopy using a molar extinction coefficient of 36,000 M−1 cm−1 at 412 nm79. Igor Pro Software (WaveMetrics, Inc. Portland, OR, USA) was used for data analysis.
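Determining a stock concentration from an absorbance reading follows the Beer–Lambert law, c = A / (ε · l). The sketch below works through that arithmetic with the ThT extinction coefficient given above. The absorbance value, the 1 cm path length, the target concentration and the function names are hypothetical placeholders, not measurements or settings from this study.

```python
THT_EPSILON_412 = 36_000  # M^-1 cm^-1 at 412 nm, as stated in the text


def concentration_molar(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law: concentration (mol/L) = A / (epsilon * path length)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


def stock_volume_ul(stock_molar, target_molar, final_volume_ul):
    """Volume of stock needed for a dilution, from C1 * V1 = C2 * V2."""
    if stock_molar <= 0:
        raise ValueError("stock concentration must be positive")
    return target_molar * final_volume_ul / stock_molar


if __name__ == "__main__":
    a412 = 0.72  # hypothetical absorbance reading, assumed 1 cm path length
    stock = concentration_molar(a412, THT_EPSILON_412)
    print(f"ThT stock: {stock * 1e6:.1f} uM")
    # Hypothetical dilution to 0.5 uM in a 1000 ul final volume.
    print(f"Stock needed: {stock_volume_ul(stock, 0.5e-6, 1000):.1f} ul")
```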
Circular Dichroism (CD) spectroscopy
For CD spectroscopy, a JASCO J-815 circular dichroism spectropolarimeter (JASCO Inc., Easton, MD, USA) equipped with a Peltier-type temperature control system was used. Spectra of all samples for comparison were recorded at 5 °C using 10 mm quartz cells (3.5 mL, 111-QS, Hellma). The CD thermal denaturation experiment for SegC-G4 was performed by varying the temperature from 15 to 95 °C (and reverse) with a 5 °C/min increment and a 1-min waiting period at each temperature point. The Tm value for SegC-G4 was determined from the first derivative of the normalized ellipticity (mdeg) at 262 nm versus temperature (°C) curve.
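Estimating Tm by differentiating a melting curve can be sketched as follows. This is an illustration under stated assumptions, not the Igor Pro procedure used for the analysis: the derivative is a central difference on interior points, Tm is taken as the temperature of steepest change, and the curve in the usage example is a synthetic two-state sigmoid, not experimental data.

```python
import math


def melting_temperature(temps, signal):
    """Estimate Tm as the temperature where the signal changes fastest.

    Uses a central-difference first derivative on interior points and
    returns (Tm, slope at Tm). Returns (None, 0.0) for a flat curve.
    """
    if len(temps) != len(signal) or len(temps) < 3:
        raise ValueError("need at least three paired temperature/signal points")
    best_temp, best_slope = None, 0.0
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if abs(slope) > abs(best_slope):
            best_temp, best_slope = temps[i], slope
    return best_temp, best_slope


if __name__ == "__main__":
    # Synthetic normalized ellipticity: a sigmoid centered at 60 degrees C.
    temps = list(range(15, 100, 5))
    signal = [1 / (1 + math.exp((t - 60) / 4)) for t in temps]
    tm, slope = melting_temperature(temps, signal)
    print(tm, slope)
```

With coarse temperature steps the estimate is limited to the sampled temperatures; fitting or interpolating the derivative would refine it.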
UV–Vis absorption spectroscopy
A Cary 8454 spectrophotometer (Agilent Technologies; Santa Clara, CA, USA) equipped with a Peltier-type temperature control system was used to record the UV–Vis absorption spectra of the samples. UV–Vis thermal denaturation experiments were performed by changing the temperature between 15 and 95 °C (and reverse) with 2 °C/min increments.
Fluorescence spectroscopy
Fluorescence spectroscopy was performed with a Cary Eclipse Fluorescence Spectrometer (HORIBA Ltd., Kyoto, Japan). All oligonucleotide samples were prepared before the experiments using the annealing procedure described above. The parameters for the fluorescence experiments were as follows: emission spectra collected between 430 and 700 nm, an excitation wavelength of 412 nm, excitation and emission slit widths of 5.0 nm, operation at 800 V, and a scan rate of 600 nm/min.
For the fluorescence experiments, 0.5 µM ThT and 2.0 µM nucleic acid were selected as the optimal concentrations. The results are presented as a bar graph of F − F0 at 488 nm, where F0 is the fluorescence of ThT alone and F is the fluorescence of ThT after the addition of the oligonucleotides.
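The F − F0 quantity can be computed from emission spectra as in the following sketch. The data layout (a dictionary mapping wavelength in nm to intensity), the nearest-wavelength lookup, the function names and every number in the usage example are hypothetical placeholders for illustration only; none of them are values from this study.

```python
def intensity_at(spectrum, wavelength_nm):
    """Return the intensity at the recorded wavelength closest to the target."""
    nearest = min(spectrum, key=lambda w: abs(w - wavelength_nm))
    return spectrum[nearest]


def light_up(sample_spectra, tht_alone, wavelength_nm=488):
    """Compute F - F0 per sample, where F0 is ThT alone at the same wavelength."""
    f0 = intensity_at(tht_alone, wavelength_nm)
    return {name: intensity_at(spectrum, wavelength_nm) - f0
            for name, spectrum in sample_spectra.items()}


if __name__ == "__main__":
    # Placeholder spectra (arbitrary units), not experimental readings.
    tht_alone = {487: 10.0, 488: 12.0, 489: 11.0}
    samples = {
        "oligo_A": {487: 240.0, 488: 250.0, 489: 245.0},
        "oligo_B": {487: 14.0, 488: 15.0, 489: 14.5},
    }
    print(light_up(samples, tht_alone))
```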
Statistical analysis
Experiments were repeated independently at least twice. Results, where appropriate, are presented as the mean ± standard error (S.E.) of three biological replicates. Statistical analyses were performed using a two-tailed unpaired t-test with a minimum confidence level of 95%.
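The summary statistics named above can be sketched with the Python standard library. This example assumes the pooled-variance (Student) form of the unpaired t-test, which the text does not specify, and compares the statistic with the two-tailed critical value for 4 degrees of freedom at α = 0.05 (2.776) instead of computing a p-value. The replicate values and function names are hypothetical.

```python
import math
from statistics import mean, stdev, variance

T_CRIT_DF4 = 2.776  # two-tailed critical t, alpha = 0.05, 4 degrees of freedom


def mean_se(values):
    """Return the mean and standard error (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def unpaired_t(a, b):
    """Student's t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


if __name__ == "__main__":
    control = [1.0, 2.0, 3.0]  # hypothetical triplicate values
    treated = [4.0, 5.0, 6.0]  # hypothetical triplicate values
    print("control mean, S.E.:", mean_se(control))
    t, df = unpaired_t(control, treated)
    print("t =", t, "df =", df, "significant:", abs(t) > T_CRIT_DF4)
```

With three replicates per group the test has 4 degrees of freedom, so only fairly large differences relative to the within-group spread reach the 95% threshold.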
Acknowledgements
This work was supported by grants from TUBITAK-KBAG (118Z957) and METU-BAP (108-2021-10640). We gratefully acknowledge the critical guidance of Dr. Nurhan Özlü and Büşra Akarlar of the Koç University Proteomics Facility, Istanbul, Turkey, for the execution and analysis of mass spectrometry. We thank Öykü Deniz Demiralay, Gizem Turan and Hazal Ayten for plasmid constructions and IB. We thank members of the Muyan laboratory for stimulating discussions, contributions, and critical reading of the manuscript.
Author contributions
P.Y., G.A., and M.M. designed and oversaw the study. P.Y., G.K., K.Y., and G.A. performed and were involved in all experiments. E.B., Z.S., and Ö.P.Ç. carried out G4 experiments. Ç.O., P.Y., K.Y., M.M., and T.C. wrote the motif search tool. All contributors reviewed and approved the final manuscript.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Pelin Yaşar, Gizem Kars and Kerim Yavuz.
Contributor Information
Pelin Yaşar, Email: pelinyasar@nih.gov.
Mesut Muyan, Email: mmuyan@metu.edu.tr.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-021-95165-6.
References
- 1. Defossez P-A, Stancheva I. Biological functions of methyl-CpG-binding proteins. Prog. Mol. Biol. Transl. Sci. 2011;101:377–398. doi: 10.1016/B978-0-12-387685-0.00012-3.
- 2. Smith ZD, Meissner A. DNA methylation: roles in mammalian development. Nat. Rev. Genet. 2013;14:204–220. doi: 10.1038/nrg3354.
- 3. Saxonov S, Berg P, Brutlag DL. A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters. Proc. Natl. Acad. Sci. 2006;103:1412–1417. doi: 10.1073/pnas.0510310103.
- 4. Deaton AM, Bird A. CpG islands and the regulation of transcription. Genes Dev. 2011;25:1010–1022. doi: 10.1101/gad.2037511.
- 5. Zheng Y, et al. A pan-cancer analysis of CpG Island gene regulation reveals extensive plasticity within Polycomb target genes. Nat. Commun. 2021;12:2485. doi: 10.1038/s41467-021-22720-0.
- 6. Wang Y, et al. Methylation status of ADAM12 promoter are associated with its expression levels in colorectal cancer. Pathol. Res. Pract. 2021;221:153449. doi: 10.1016/j.prp.2021.153449.
- 7. Long HK, Blackledge NP, Klose RJ. ZF-CxxC domain-containing proteins, CpG islands and the chromatin connection. Biochem. Soc. Trans. 2013;41:727–740. doi: 10.1042/BST20130028.
- 8. Xiong X, Tu S, Wang J, Luo S, Yan X. CXXC5: A novel regulator and coordinator of TGF-β, BMP and Wnt signaling. J. Cell. Mol. Med. 2019;23:740–749. doi: 10.1111/jcmm.14046.
- 9. Blackledge NP, Klose R. CpG island chromatin: A platform for gene regulation. Epigenetics. 2011;6:147–152. doi: 10.4161/epi.6.2.13640.
- 10. Yasar P, Muyan M. CXXC5 (CXXC finger protein 5). Atlas Genet. Cytogenet. Oncol. Haematol. 2015;19:1–3.
- 11. Pendino F, et al. Functional involvement of RINF, retinoid-inducible nuclear factor (CXXC5), in normal and tumoral human myelopoiesis. Blood. 2009;113:3172–3181. doi: 10.1182/blood-2008-07-170035.
- 12. Yan X, et al. CXXC5 suppresses hepatocellular carcinoma by promoting TGF-β-induced cell cycle arrest and apoptosis. J. Mol. Cell Biol. 2018;10:48–59. doi: 10.1093/jmcb/mjx042.
- 13. Kim H, et al. CXXC5 is a transcriptional activator of Flk-1 and mediates bone morphogenic protein-induced endothelial cell differentiation and vessel formation. FASEB J. 2014;28:615–626. doi: 10.1096/fj.13-236216.
- 14. Andersson T, et al. CXXC5 is a novel BMP4-regulated modulator of wnt signaling in neural stem cells. J. Biol. Chem. 2009;284:3672–3681. doi: 10.1074/jbc.M808119200.
- 15. Kim H, et al. CXXC5 is a negative-feedback regulator of the Wnt/β-catenin pathway involved in osteoblast differentiation. Cell Death Differ. 2015;22:912–920. doi: 10.1038/cdd.2014.238.
- 16. Lee SH, et al. The Dishevelled-binding protein CXXC5 negatively regulates cutaneous wound healing. J. Exp. Med. 2015;212:1061–1080. doi: 10.1084/jem.20141601.
- 17. Kim MY, et al. CXXC5 plays a role as a transcription activator for myelin genes on oligodendrocyte differentiation. Glia. 2016;64:350–362. doi: 10.1002/glia.22932.
- 18. Nott SL, et al. Genomic responses from the estrogen-responsive element-dependent signaling pathway mediated by estrogen receptor α are required to elicit cellular alterations. J. Biol. Chem. 2009;284:15277–15288. doi: 10.1074/jbc.M900365200.
- 19. Yaşar P, Ayaz G, Muyan M. Estradiol-estrogen receptor α mediates the expression of the CXXC5 gene through the estrogen response element-dependent signaling pathway. Sci. Rep. 2016;6:37808. doi: 10.1038/srep37808.
- 20. Choi S, et al. CXXC5 mediates growth plate senescence and is a target for enhancement of longitudinal bone growth. Life Sci. Alliance. 2019;2:e201800254. doi: 10.26508/lsa.201800254.
- 21. Ravichandran M, et al. Rinf regulates pluripotency network genes and tet enzymes in embryonic stem cells. Cell Rep. 2019;28:1993–2003.e5. doi: 10.1016/j.celrep.2019.07.080.
- 22. Ma S, et al. Epigenetic regulator CXXC5 recruits DNA demethylase Tet2 to regulate TLR7/9-elicited IFN response in pDCs. J. Exp. Med. 2017;214:1471–1491. doi: 10.1084/jem.20161149.
- 23. Aras S, et al. Oxygen-dependent expression of cytochrome c oxidase subunit 4–2 gene expression is mediated by transcription factors RBPJ, CXXC5 and CHCHD2. Nucleic Acids Res. 2013;41:2255–2266. doi: 10.1093/nar/gks1454.
- 24. Li G, et al. CXXC5 regulates differentiation of C2C12 myoblasts into myocytes. J. Muscle Res. Cell Motil. 2014;35:259–265. doi: 10.1007/s10974-014-9400-2.
- 25. Tsuchiya Y, et al. ThPOK represses CXXC5, which induces methylation of histone H3 lysine 9 in Cd40lg promoter by association with SUV39H1: implications in repression of CD40L expression in CD8+ cytotoxic T cells. J. Leukoc. Biol. 2016;100:327–338. doi: 10.1189/jlb.1A0915-396RR.
- 26. Astori A, et al. The epigenetic regulator RINF (CXXC5) maintains SMAD7 expression in human immature erythroid cells and sustains red blood cells expansion. Haematologica. 2020. doi: 10.3324/haematol.2020.263558. [Early view]
- 27. Wang X, et al. CXXC5 associates with smads to mediate TNF-α induced apoptosis. Curr. Mol. Med. 2013;13:1385–1396. doi: 10.2174/15665240113139990069.
- 28. Joshi HR, et al. Frontline science: Cxxc5 expression alters cell cycle and myeloid differentiation of mouse hematopoietic stem and progenitor cells. J. Leukoc. Biol. 2020;108:469–484. doi: 10.1002/JLB.1HI0120-169R.
- 29. Ayaz G, et al. CXXC5 as an unmethylated CpG dinucleotide binding protein contributes to estrogen-mediated cellular proliferation. Sci. Rep. 2020;10:1–10. doi: 10.1038/s41598-019-56847-4.
- 30. Marshall PA, et al. Discovery of novel vitamin D receptor interacting proteins that modulate 1,25-dihydroxyvitamin D3 signaling. J. Steroid Biochem. Mol. Biol. 2012;132:147–159. doi: 10.1016/j.jsbmb.2012.05.001.
- 31. Zhang M, et al. The CXXC finger 5 protein is required for DNA damage-induced p53 activation. Sci. China Ser. C. 2009;52:528–538. doi: 10.1007/s11427-009-0083-7.
- 32. Knappskog S, et al. RINF (CXXC5) is overexpressed in solid tumors and is an unfavorable prognostic factor in breast cancer. Ann. Oncol. 2011;22:2208–2215. doi: 10.1093/annonc/mdq737.
- 33. May-Panloup P, et al. Molecular characterization of corona radiata cells from patients with diminished ovarian reserve using microarray and microfluidic-based gene expression profiling. Hum. Reprod. 2012;27:829–843. doi: 10.1093/humrep/der431.
- 34. L’Hôte D, et al. Discovery of novel protein partners of the transcription factor FOXL2 provides insights into its physiopathological roles. Hum. Mol. Genet. 2012;21:3264–3274. doi: 10.1093/hmg/dds170.
- 35. Treppendahl MB, Möllgård L, Hellström-Lindberg E, Cloos P, Grønbaek K. Downregulation but lack of promoter hypermethylation or somatic mutations of the potential tumor suppressor CXXC5 in MDS and AML with deletion 5q. Eur. J. Haematol. 2013;90:259–260. doi: 10.1111/ejh.12045.
- 36. Centritto F, et al. Cellular and molecular determinants of all-trans retinoic acid sensitivity in breast cancer: Luminal phenotype and RARα expression. EMBO Mol. Med. 2015;7:950–972. doi: 10.15252/emmm.201404670.
- 37. Benedetti I, De Marzo AM, Geliebter J, Reyes N. CXXC5 expression in prostate cancer: implications for cancer progression. Int. J. Exp. Pathol. 2017;98:234–243. doi: 10.1111/iep.12241.
- 38. Chen X, Wang X. The KN motif and ankyrin repeat domains 1/CXXC finger protein 5 axis regulates epithelial-mesenchymal transformation, metastasis and apoptosis of gastric cancer via wnt signaling. Onco Targets Ther. 2020;13:7343–7352. doi: 10.2147/OTT.S240991.
- 39. Pope SD, Medzhitov R. Emerging principles of gene expression programs and their regulation. Mol. Cell. 2018;71:389–397. doi: 10.1016/j.molcel.2018.07.017.
- 40. Stadhouders R, Filion GJ, Graf T. Transcription factors and 3D genome conformation in cell-fate decisions. Nature. 2019;569:345–354. doi: 10.1038/s41586-019-1182-7.
- 41. Kornblihtt AR, et al. Alternative splicing: A pivotal step between eukaryotic transcription and translation. Nat. Rev. Mol. Cell Biol. 2013;14:153–165. doi: 10.1038/nrm3525.
- 42. Ryu JY, Kim HU, Lee SY. Human genes with a greater number of transcript variants tend to show biological features of housekeeping and essential genes. Mol. Biosyst. 2015;11:2798–2807. doi: 10.1039/C5MB00322A.
- 43. Landry J-R, Mager DL, Wilhelm BT. Complex controls: The role of alternative promoters in mammalian genomes. Trends Genet. 2003;19:640–648. doi: 10.1016/j.tig.2003.09.014.
- 44. Sorek R. A non-EST-based method for exon-skipping prediction. Genome Res. 2004;14:1617–1623. doi: 10.1101/gr.2572604.
- 45. Pruitt KD, Tatusova T, Maglott DR. NCBI reference sequences (RefSeq): A curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007;35:D61–D65. doi: 10.1093/nar/gkl842.
- 46. Ashurst JL, Collins JE. Gene annotation prediction and testing. Annu. Rev. Genom. Hum. Genet. 2003;4:69–88. doi: 10.1146/annurev.genom.4.070802.110300.
- 47. Trinklein ND. Identification and functional analysis of human transcriptional promoters. Genome Res. 2003;13:308–312. doi: 10.1101/gr.794803.
- 48. Astori A, et al. CXXC5 (retinoid-inducible nuclear factor, RINF) is a potential therapeutic target in high-risk human acute myeloid leukemia. Oncotarget. 2013;4:1438–1448. doi: 10.18632/oncotarget.1195.
- 49. Frohman MA, Dush MK, Martin GR. Rapid production of full-length cDNAs from rare transcripts: Amplification using a single gene-specific oligonucleotide primer. Proc. Natl. Acad. Sci. 1988;85:8998–9002. doi: 10.1073/pnas.85.23.8998.
- 50. Raabe CA, Tang TH, Brosius J, Rozhdestvensky TS. Biases in small RNA deep sequencing data. Nucleic Acids Res. 2014;42:1414–1426. doi: 10.1093/nar/gkt1021.
- 51. Yeku O, Frohman MA. Rapid amplification of cDNA ends (RACE). In: Nielsen H, editor. RNA Methods in Molecular Biology (Methods and Protocols). Humana Press; 2011. pp. 107–122.
- 52. Berry M, Nunez AM, Chambon P. Estrogen-responsive element of the human pS2 gene is an imperfectly palindromic sequence. Proc. Natl. Acad. Sci. 1989;86:1218–1222. doi: 10.1073/pnas.86.4.1218.
- 53. Stack G, et al. Structure and function of the pS2 gene and estrogen receptor in human breast cancer cells. Breast Cancer Cell. Mol. Biol. 1988;40:185–206. doi: 10.1007/978-1-4613-1733-3_8.
- 54. Hartono SR, Korf IF, Chédin F. GC skew is a conserved property of unmethylated CpG island promoters across vertebrates. Nucleic Acids Res. 2015;43:9729–9741. doi: 10.1093/nar/gkv811.
- 55. Li E, Zhang Y. DNA methylation in mammals. Cold Spring Harb. Perspect. Biol. 2014;6:a019133–a019133. doi: 10.1101/cshperspect.a019133.
- 56. Brázda V, Bartas M, Bowater RP. Evolution of diverse strategies for promoter regulation. Trends Genet. 2021;1:1–15. doi: 10.1016/j.tig.2021.04.003.
- 57. Kumaki Y, Oda M, Okano M. QUMA: Quantification tool for methylation analysis. Nucleic Acids Res. 2008;36:W170–W175. doi: 10.1093/nar/gkn294.
- 58. Schones DE, et al. Dynamic regulation of nucleosome positioning in the human genome. Cell. 2008;132:887–898. doi: 10.1016/j.cell.2008.02.022.
- 59. Howe FS, Fischl H, Murray SC, Mellor J. Is H3K4me3 instructive for transcription activation? BioEssays. 2017;39:e201600095. doi: 10.1002/bies.201600095.
- 60. Barski A, et al. High-resolution profiling of histone methylations in the human genome. Cell. 2007;129:823–837. doi: 10.1016/j.cell.2007.05.009.
- 61. Guenther MG, Levine SS, Boyer LA, Jaenisch R, Young RA. A chromatin landmark and transcription initiation at most promoters in human cells. Cell. 2007;130:77–88. doi: 10.1016/j.cell.2007.05.042.
- 62. Szklarczyk D, et al. STRING v11: Protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47:607–613. doi: 10.1093/nar/gky1131.
- 63. Huang DW, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009;4:44–57. doi: 10.1038/nprot.2008.211.
- 64. Zheng R, et al. Cistrome data browser: Expanded datasets and new tools for gene regulatory analysis. Nucleic Acids Res. 2019;47:D729–D735. doi: 10.1093/nar/gky1094.
- 65. Song J, Ugai H, Kanazawa I, Sun K, Yokoyama KK. Independent repression of a GC-rich housekeeping gene by Sp1 and MAZ involves the same cis-elements. J. Biol. Chem. 2001;276:19897–19904. doi: 10.1074/jbc.M010658200.
- 66. Song J, et al. Transcriptional regulation by zinc-finger proteins Sp1 and MAZ involves interactions with the same cis-elements. Int. J. Mol. Med. 2003;1:547–553.
- 67. Larsen S, Kawamoto S, Tanuma S, Uchiumi F. The hematopoietic regulator, ELF-1, enhances the transcriptional response to Interferon-β of the OAS1 anti-viral gene. Sci. Rep. 2015;5:17497. doi: 10.1038/srep17497.
- 68. Dyson NJ. RB1: A prototype tumor suppressor and an enigma. Genes Dev. 2016;30:1492–1502. doi: 10.1101/gad.282145.116.
- 69. Bossone SA, Asselin C, Patel AJ, Marcu KB. MAZ, a zinc finger protein, binds to c-MYC and C2 gene sequences regulating transcriptional initiation and termination. Proc. Natl. Acad. Sci. 1992;89:7452–7456. doi: 10.1073/pnas.89.16.7452.
- 70. Pietenpol JA, Münger K, Howley PM, Stein RW, Moses HL. Factor-binding element in the human c-myc promoter involved in transcriptional regulation by transforming growth factor β1 and by the retinoblastoma gene product. Proc. Natl. Acad. Sci. 1991;88:10227–10231. doi: 10.1073/pnas.88.22.10227.
- 71. Sunkel B, et al. Integrative analysis identifies targetable CREB1/FoxA1 transcriptional co-regulation as a predictor of prostate cancer recurrence. Nucleic Acids Res. 2017;45:6993. doi: 10.1093/nar/gkx282.
- 72. Oguztuzun C, Yasar P, Yavuz K, Muyan M, Can T. MotifGenie: A python application for searching transcription factor binding sequences using ChIP-Seq datasets. Bioinformatics. 2021;1:379. doi: 10.1093/bioinformatics/btab379.
- 73. Fornes O, et al. JASPAR 2020: Update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2019;48:D87. doi: 10.1093/nar/gkz1001.
- 74. Vavouri T, Lehner B. Human genes with CpG island promoters have a distinct transcription-associated chromatin organization. Genome Biol. 2012;13:R110. doi: 10.1186/gb-2012-13-11-r110.
- 75. Varshney D, Spiegel J, Zyner K, Tannahill D, Balasubramanian S. The regulation and functions of DNA and RNA G-quadruplexes. Nat. Rev. Mol. Cell Biol. 2020;21:459–474. doi: 10.1038/s41580-020-0236-x.
- 76. Lombardi EP, Londoño-Vallejo A. A guide to computational methods for G-quadruplex prediction. Nucleic Acids Res. 2020;48:1603. doi: 10.1093/nar/gkaa033.
- 77. Brázda V, et al. G4Hunter web application: A web server for G-quadruplex prediction. Bioinformatics. 2019;35:3493–3495. doi: 10.1093/bioinformatics/btz087.
- 78. Doluca O. G4Catchall: A G-quadruplex prediction approach considering atypical features. J. Theor. Biol. 2019;463:92–98. doi: 10.1016/j.jtbi.2018.12.007.
- 79. De La Faverie AR, Guédin A, Bedrat A, Yatsunyk LA, Mergny JL. Thioflavin T as a fluorescence light-up probe for G4 formation. Nucleic Acids Res. 2014;42:e65. doi: 10.1093/nar/gku111.
- 80. Agrawal P, Hatzakis E, Guo K, Carver M, Yang D. Solution structure of the major G-quadruplex formed in the human VEGF promoter in K+: Insights into loop interactions of the parallel G-quadruplexes. Nucleic Acids Res. 2013;41:10584–10592. doi: 10.1093/nar/gkt784.
- 81. Bilgen E, Persil Çetinkol Ö. Doxorubicin exhibits strong and selective association with VEGF Pu22 G-quadruplex. Biochim. Biophys. Acta Gen. Subj. 2020;1864:129720. doi: 10.1016/j.bbagen.2020.129720.
- 82. Carvalho J, Queiroz JA, Cruz C. Circular dichroism of G-Quadruplex: A laboratory experiment for the study of topology and ligand binding. J. Chem. Educ. 2017;94:1547–1551. doi: 10.1021/acs.jchemed.7b00160.
- 83. Zhou B, et al. Characterizations of distinct parallel and antiparallel G-quadruplexes formed by two-repeat ALS and FTD related GGGGCC sequence. Sci. Rep. 2018;8:1–7. doi: 10.1038/s41598-018-20852-w.
- 84. del Villar-Guerra R, Trent JO, Chaires JB. G-quadruplex secondary structure obtained from circular dichroism spectroscopy. Angew. Chem. Int. Ed. 2017;57:7171–7175. doi: 10.1002/anie.201709184.
- 85. Zhou J, et al. Formation of i-motif structure at neutral and slightly alkaline pH. Mol. Biosyst. 2010;6:580–586. doi: 10.1039/B919600E.
- 86. Kejnovská I, Kypr J, Vorlíčková M. Circular dichroism spectroscopy of conformers of (guanine + adenine) repeat strands of DNA. Chirality. 2003;15:584–592. doi: 10.1002/chir.10249.
- 87. Gray RD, Chaires JB. Analysis of multidimensional G-quadruplex melting curves. Curr. Protoc. Nucleic Acid Chem. 2011;45:1–16. doi: 10.1002/0471142700.nc1704s45.
- 88. Zhang Y, Chen J, Ju H, Zhou J. Thermal denaturation profile: A straightforward signature to characterize parallel G-quadruplexes. Biochimie. 2019;157:22–25. doi: 10.1016/j.biochi.2018.10.018.
- 89. Wachter E, et al. Synthetic CpG islands reveal DNA sequence determinants of chromatin structure. Elife. 2014;3:e03397. doi: 10.7554/eLife.03397.
- 90. Ramirez-Carrozzi VR, et al. A unifying model for the selective regulation of inducible transcription by CpG islands and nucleosome remodeling. Cell. 2009;138:114–128. doi: 10.1016/j.cell.2009.04.020.
- 91. Sharrocks AD. The ETS-domain transcription factor family. Nat. Rev. Mol. Cell Biol. 2001;2:827–837. doi: 10.1038/35099076.
- 92. Hsing M, Wang Y, Rennie PS, Cox ME. ETS transcription factors as emerging drug targets in cancer. Med. Res. Rev. 2019;40:413–430. doi: 10.1002/med.21575.
- 93. Thompson CB, et al. Cis-acting sequences required for inducible interleukin-2 enhancer function bind a novel Ets-related protein, Elf-1. Mol. Cell. Biol. 1992;12:1043–1053. doi: 10.1128/mcb.12.3.1043.
- 94. Wang C, Petryniak B, Thompson C, Kaelin W, Leiden J. Regulation of the Ets-related transcription factor Elf-1 by binding to the retinoblastoma protein. Science. 1993;260:1330–1335. doi: 10.1126/science.8493578.
- 95. Polager S, Ginsberg D. p53 and E2f: Partners in life and death. Nat. Rev. Cancer. 2009;9:738–748. doi: 10.1038/nrc2718.
- 96. Song J, et al. Genomic organization and expression of a human gene for myc-associated zinc finger protein (MAZ). J. Biol. Chem. 1998;273:20603–20614. doi: 10.1074/jbc.273.32.20603.
- 97. Nozaki T, et al. Tight associations between transcription promoter type and epigenetic variation in histone positioning and modification. BMC Genomics. 2011;12:416. doi: 10.1186/1471-2164-12-416.
- 98. Robson-Dixon ND, Garcia-Blanco MA. MAZ elements alter transcription elongation and silencing of the fibroblast growth factor receptor 2 exon IIIb. J. Biol. Chem. 2004;279:29075–29084. doi: 10.1074/jbc.M312747200.
- 99. Roberts G. Co-transcriptional commitment to alternative splice site selection. Nucleic Acids Res. 1998;26:5568–5572. doi: 10.1093/nar/26.24.5568.
- 100. Gromak N, West S, Proudfoot NJ. Pause sites promote transcriptional termination of mammalian RNA polymerase II. Mol. Cell. Biol. 2006;26:3986–3996. doi: 10.1128/MCB.26.10.3986-3996.2006.
- 101.Song J, et al. Two consecutive zinc fingers in Sp1 and in MAZ are essential for interactions with cis-elements. J. Biol. Chem. 2001;276:30429–30434. doi: 10.1074/jbc.M103968200. [DOI] [PubMed] [Google Scholar]
- 102.Xu M, Katzenellenbogen RA, Grandori C, Galloway DA. An unbiased in vivo screen reveals multiple transcription factors that control HPV E6-regulated hTERT in keratinocytes. Virology. 2013;446:17–24. doi: 10.1016/j.virol.2013.07.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Jordan-Sciutto KL, Dragich JM, Caltagarone J, Hall DJ, Bowser R. Fetal Alz-50 clone 1 (FAC1) protein interacts with the Myc-associated zinc finger protein (ZF87/MAZ) and alters its transcriptional activity. Biochemistry. 2000;39:3206–3215. doi: 10.1021/bi992211q. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Spiegel J, Adhikari S, Balasubramanian S. The structure and function of DNA G-quadruplexes. Trends Chem. 2020;2:123–136. doi: 10.1016/j.trechm.2019.07.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Kolesnikova S, Curtis EA. Structure and function of multimeric G-quadruplexes. Molecules. 2019;24:1–20. doi: 10.3390/molecules24173074. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Monti P, et al. Evaluating the influence of a G-quadruplex prone sequence on the transactivation potential by wild-type and/or mutant P53 family proteins through a yeast-based functional assay. Genes. 2021;12:277. doi: 10.3390/genes12020277. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Sutherland C, Cui Y, Mao H, Hurley LH. A Mechanosensor mechanism controls the G-quadruplex/i-motif molecular switch in the MYC promoter NHE III 1. J. Am. Chem. Soc. 2016;138:14138–14151. doi: 10.1021/jacs.6b09196. [DOI] [PubMed] [Google Scholar]
- 108.Thakur RK, et al. Metastases suppressor NM23-H2 interaction with G-quadruplex DNA within c-MYC promoter nuclease hypersensitive element induces c-MYC expression. Nucleic Acids Res. 2009;37:172–183. doi: 10.1093/nar/gkn919.
- 109.Lew A, Rutter WJ, Kennedy GC. Unusual DNA structure of the diabetes susceptibility locus IDDM2 and its effect on transcription by the insulin promoter factor Pur-1/MAZ. Proc. Natl. Acad. Sci. 2000;97:12508–12512. doi: 10.1073/pnas.97.23.12508.
- 110.Sakatsume O, et al. Binding of THZif-1, a MAZ-like zinc finger protein to the nuclease-hypersensitive element in the promoter region of the c-MYC protooncogene. J. Biol. Chem. 1996;271:31322–31333. doi: 10.1074/jbc.271.49.31322.
- 111.Palumbo SL, et al. A novel G-quadruplex-forming GGA repeat region in the c-myb promoter is a critical regulator of promoter activity. Nucleic Acids Res. 2008;36:1755–1769. doi: 10.1093/nar/gkm1069.
- 112.Tippens ND, Vihervaara A, Lis JT. Enhancer transcription: what, where, when, and why? Genes Dev. 2018;32:1–3. doi: 10.1101/gad.311605.118.
- 113.Field A, Adelman K. Evaluating enhancer function and transcription. Annu. Rev. Biochem. 2020;89:213–243. doi: 10.1146/annurev-biochem-011420-095916.
- 114.Muyan M, et al. Modulation of estrogen response element-driven gene expressions and cellular proliferation with polar directions by designer transcription regulators. PLoS ONE. 2015;10:e0136423. doi: 10.1371/journal.pone.0136423.
- 115.Masiakowski P, et al. Cloning of cDNA sequences of hormone-regulated genes from the MCF-7 human breast cancer cell line. Nucleic Acids Res. 1982;10:7895–7903. doi: 10.1093/nar/10.24.7895.
- 116.Nunez A-M, Berry M, Imler JL, Chambon P. The 5′ flanking region of the pS2 gene contains a complex enhancer region responsive to oestrogens, epidermal growth factor, a tumor promoter (TPA), the c-Ha-ras oncoprotein and the c-jun protein. EMBO J. 1989;8:823–829. doi: 10.1002/j.1460-2075.1989.tb03443.x.
- 117.Jeltsch JM, et al. Structure of the human oestrogen-responsive gene pS2. Nucleic Acids Res. 1987;15:1401. doi: 10.1093/nar/15.4.1401.
- 118.Yi P, Bhagat S, Hilf R, Bambara RA, Muyan M. Differences in the abilities of estrogen receptors to integrate activation functions are critical for subtype-specific transcriptional responses. Mol. Endocrinol. 2002;16:1810–1827. doi: 10.1210/me.2001-0323.
- 119.Huang J, et al. Binding of estrogen receptor β to estrogen response element in situ is independent of estradiol and impaired by its amino terminus. Mol. Endocrinol. 2005;19:2696–2712. doi: 10.1210/me.2005-0120.
- 120.Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-ΔΔ C(T)) Method. Methods. 2001;25:402–408. doi: 10.1006/meth.2001.1262.
- 121.Bustin S, et al. The MIQE guidelines: Minimum information for publication of quantitative real-time PCR experiments. Clin. Chem. 2009;55:611–622. doi: 10.1373/clinchem.2008.112797.
- 122.Pardo CE, et al. MethylViewer: Computational analysis and editing for bisulfite sequencing and methyltransferase accessibility protocol for individual templates (MAPit) projects. Nucleic Acids Res. 2011;39:e5. doi: 10.1093/nar/gkq716.
- 123.Nagane Y, Utsugisawa K, Tohgi H. PCR amplification in bisulfite methylcytosine mapping in the GC-rich promoter region of amyloid precursor protein gene in autopsy human brain. Brain Res. Protoc. 2000;5:167–171. doi: 10.1016/S1385-299X(00)00008-8.
- 124.Muyan M, Callahan LM, Huang Y, Lee AJ. The ligand-mediated nuclear mobility and interaction with estrogen-responsive elements of estrogen receptors are subtype specific. J. Mol. Endocrinol. 2012;49:249–266. doi: 10.1530/JME-12-0097.
- 125.Szklarczyk D, et al. The STRING database in 2017: Quality-controlled protein–protein association networks, made broadly accessible. Nucleic Acids Res. 2017;45:D362–D368. doi: 10.1093/nar/gkw937.
- 126.Huang DW, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009;37:1–13. doi: 10.1093/nar/gkn923.