Molecular and Cellular Biology. 2022 Mar 17;42(3):e00520-21. doi: 10.1128/mcb.00520-21

Target Gene Diversity of the Nrf1-MafG Transcription Factor Revealed by a Tethered Heterodimer

Fumiki Katsuoka a,b, Akihito Otsuki a, Nozomi Hatanaka a, Haruna Okuyama a, Masayuki Yamamoto a,b,c
PMCID: PMC8929377  PMID: 35129372

ABSTRACT

Members of the cap’n’collar (CNC) family of transcription factors, including Nrf1 and Nrf2, heterodimerize with small Maf (sMaf) proteins (MafF, MafG, and MafK) and regulate target gene expression through CNC-sMaf-binding elements (CsMBEs). We recently developed a unique tethered dimer assessment system combined with small Maf triple-knockout fibroblasts, which enabled the characterization of specific CNC-sMaf heterodimer functions. In this study, we evaluated the molecular function of the tethered Nrf1-MafG (T-N1G) heterodimer. We found that T-N1G activates the expression of proteasome subunit genes, well-known Nrf1 target genes, and binds specifically to CsMBEs in the proximity of these genes. T-N1G was also found to activate genes involved in proteostasis-related pathways, including endoplasmic reticulum-associated degradation, chaperone, and ubiquitin-mediated degradation pathways, indicating that the Nrf1-MafG heterodimer regulates a wide range of proteostatic stress response genes. By taking advantage of this assessment system, we found that Nrf1 has the potential to activate canonical Nrf2 target cytoprotective genes when strongly induced. Our results also revealed that transposable SINE B2 repeats harbor CsMBEs with high frequency and contribute to the target gene diversity of CNC-sMaf transcription factors.

KEYWORDS: Nrf1, Nrf2, small Maf, tethered molecule, CsMBE, MafG, transcriptional regulation

INTRODUCTION

Many transcription factors form structurally related families (1), and individual members of these transcription factor families often show functional redundancy and mutual interference. Therefore, it is challenging to characterize transcription factor families. In particular, transcription factors that possess specific dimerization domains form various combinations of heterodimers and homodimers (2). Moreover, these dimer forms exhibit intricate patterns of regulation, making functional analyses even more difficult. Consequently, it is often unclear or even controversial which of the possible heterodimers are functional. Nonetheless, clarifying the complexity of dimer-forming transcription factors is an indispensable step toward understanding finely tuned gene regulatory networks.

The cap’n’collar (CNC) family is a transcription factor family with basic-region leucine zipper (bZIP) structures. Members of this family form heterodimers with small Maf (sMaf) family transcription factors through their bZIP structures and participate in the regulation of genes involved in various biological pathways (3–5). Of the CNC family members, Nrf2 (NFE2L2) is a well-characterized master regulator of oxidative and xenobiotic stress responses (6). In the absence of stress, Nrf2 is efficiently degraded by Keap1-mediated proteasomal degradation, and Nrf2 activity within the cells is kept low (7–9). However, electrophilic chemicals or reactive oxygen species efficiently modify the cysteine residues of Keap1 and inactivate the ubiquitin ligase activity of Keap1, leading to Nrf2 accumulation in nuclei (10, 11). Nrf2 then induces a battery of cytoprotective genes, including antioxidant and xenobiotic-metabolizing enzyme genes. On the other hand, Nrf1 (NFE2L1) is known to regulate a battery of proteasome subunit genes in response to proteasome inhibition (12, 13). However, the target gene specificity of Nrf1 and Nrf2 is not fully understood. In fact, it was reported that Nrf1 regulates oxidative stress response genes (14), whereas Nrf2 regulates proteasome subunit genes (15). Thus, the target gene specificity of Nrf1 and Nrf2 remains to be clarified.

The binding consensus sequence of CNC-sMaf heterodimers has been referred to by various names, such as the NF-E2-binding element (16), the antioxidant response element (ARE), or the electrophilic response element (EpRE) (17–19). These sequences share high similarity and converge to 5′-RTGASnnnGC-3′ (where R is A or G and S is G or C). Therefore, it has been proposed that they be referred to collectively as CNC-sMaf-binding elements (CsMBEs) (20, 21). In the CsMBE sequence, the R at the 5′ end is known to be favorable for CNC binding, and the GC at the 3′ end is favorable for sMaf binding (22). Importantly, the CsMBE sequence is clearly distinct from the binding sequence of the Maf homodimer, which is referred to as the Maf recognition element (MARE) (TGCTGASTCAGCA). In MARE, both ends of the core sequence are GC, so Maf homodimers preferentially recognize MARE (22).

The sMaf family consists of three functionally redundant factors, MafF, MafG, and MafK, which serve as obligatory heterodimerization partners of CNC family proteins (23). The function and importance of the sMaf family proteins for CNC-mediated transcriptional regulation are supported not only by a series of biochemical studies but also by elaborate genetic studies using a number of mutant mice with a loss of function of the CNC and sMaf factors (24, 25). In particular, a triple sMaf knockout line of mice was generated and utilized for assessment (25), and the genetic analyses revealed many important features of the sMaf and CNC factor relationships. For instance, Nrf2 deficiency and sMaf deficiency independently provoke quite similar and severe impairments in the induction of antioxidant and xenobiotic-metabolizing enzyme genes in response to electrophiles, supporting the finding that Nrf2 and sMaf function as heterodimers (6, 23). Similarly, liver-specific Nrf1 deficiency and sMaf deficiency independently provoke steatohepatitis with dysregulation of genes, including proteasome subunit genes, indicating that Nrf1 and sMaf function as heterodimers (26–28).

One caveat remains, however: these mouse genetic studies provide only supporting evidence that CNC factors and sMafs act in the same biological pathways. It remains possible that some CNC factors heterodimerize with factors other than sMafs and thereby give rise to biological regulation similar to that attributed to the sMaf factors. In fact, in an in vitro analysis, Nrf1 and Nrf2 were reported to show an affinity for other bZIP factors (2). These observations led us to examine the partnership of the CNC and sMaf factors in a more rigorous experimental system.

To obtain further direct evidence that sMaf factors are indispensable functional partners for CNC factors, we recently developed a unique system utilizing a tethered CNC-sMaf heterodimer together with cells deficient for all three small Maf proteins (29). Importantly, in cells lacking all three sMafs, no heterodimer or homodimer can function if its function depends on the presence of sMafs. Therefore, when we introduced the tethered CNC-sMaf heterodimers into the cells, we were able to evaluate the function of the tethered dimers without any plausible interference from the other sMaf-containing heterodimers and sMaf homodimers. We named this system the tethered-dimer rescue system (referred to here as the TDR system). In our previous test of a tethered Nrf2-MafG (T-N2G) heterodimer using the TDR system, we found that T-N2G activated a battery of Nrf2-dependent cytoprotective genes directly through binding to CsMBEs (29), supporting the notion that MafG serves as a functional partner of Nrf2.

In this study, we aimed to test the Nrf1-MafG heterodimer in the TDR system to further validate the functional partnership of Nrf1 and sMaf factors. We also wished to examine the target gene specificity of the Nrf1-MafG heterodimer by utilizing this improved system, as the target gene profile of the Nrf1-sMaf factor has not been addressed extensively. We show here that the tethered Nrf1-MafG heterodimer (T-N1G) bound to the CsMBE and activated Nrf1 target proteasome subunit genes, demonstrating that T-N1G effectively recapitulated the function of the Nrf1-MafG heterodimer. Our analysis further revealed that in addition to canonical Nrf1 target genes, T-N1G activated various genes related to proteostasis, including those for endoplasmic reticulum (ER)-associated degradation (ERAD), chaperone, and ubiquitin-mediated degradation. We also found that transposable SINE B2 repeats harbored CsMBEs with high frequency and have contributed to the target gene diversity of the CNC-sMaf transcription factors.

RESULTS

Complementation rescue of sMaf-deficient cells by the tethered Nrf1-MafG heterodimer.

While Nrf1 is well known to regulate proteasome subunit genes (12), other aspects of the molecule have not been fully examined. We therefore decided to examine Nrf1 function comprehensively by using the TDR system, which enabled us to assess CNC-sMaf heterodimer function without interference from the other CNC and sMaf factors (Fig. 1A). To this end, we fused full-length cDNA of FLAG-tagged mouse Nrf1 with MafG cDNA by linker sequences and inserted tethered Nrf1-MafG (T-N1G) cDNA into an expression vector (Fig. 1B). Next, we introduced the T-N1G expression vector into sMaf-deficient (MafF−/−:MafG−/−:MafK−/−) mouse embryonic fibroblasts (MEFs) and established T-N1G-expressing sMaf-deficient MEFs (Fig. 1C). As a control, sMaf-deficient MEFs transfected with an empty vector were used.

FIG 1.

Establishment of MEFs expressing T-N1G. (A) Schematic representation of the tethered dimer rescue (TDR) system. In the absence of sMaf proteins, tethered dimers can exert their function, while other CNC factors or sMaf homodimers cannot. (B) A cDNA encoding FLAG-Nrf1 fused to MafG with a flexible polypeptide linker was inserted into the PiggyBac dual-promoter vector. (C) Schematic diagram of the protocol for obtaining MEFs expressing T-N1G. Through transfection followed by puromycin selection, control and T-N1G MEFs were established. (D) T-N1G-expressing cells were treated with the vehicle (−) or 1.0 μM MG132 (+), and their nuclear lysates were analyzed by immunoblot analysis using an anti-Nrf1 antibody. (E) The expression of T-N1G mRNA was examined by qPCR at the indicated times after 1.0 μM MG132 treatment. (F) T-N1G-mediated induction of the typical Nrf1 target genes Psma7 and Psmb4 was examined by qPCR. T-N1G-expressing cells were treated with the vehicle (−) or 1.0 μM MG132 (+). The expression level of each mRNA was normalized to that of Hprt mRNA. The data represent the means ± standard deviations (SDs) (n = 3). The gene expression levels in vehicle-treated control cells were set to 1.

To ascertain whether T-N1G-expressing cells inducibly expressed the T-N1G protein, we treated the cells with MG132, a proteasome inhibitor, and examined T-N1G protein expression by immunoblot analysis. We found that the T-N1G protein level was low under basal conditions but was strongly increased by MG132 treatment (Fig. 1D). In this regard, it should be noted that the cytomegalovirus (CMV) promoter (Fig. 1B) can be activated by MG132 treatment (30). It has also been reported that the CMV promoter is activated by p38 mitogen-activated protein kinase (MAPK), which can be activated by MG132 (31, 32). Therefore, we examined whether T-N1G gene expression was influenced by MG132 treatment. Indeed, quantitative PCR (qPCR) analysis revealed that T-N1G gene expression was significantly induced in a time-dependent manner after MG132 treatment (Fig. 1E). This observation implies that the accumulation of the T-N1G protein was caused by both the upregulation of T-N1G gene expression and the repression of T-N1G protein degradation. Nonetheless, comparing MG132-treated with vehicle-treated cells still allowed us to evaluate the effect of T-N1G on target gene expression, so we proceeded with the analyses described below.

We then examined whether T-N1G was functional, that is, whether it could activate a set of Nrf1 target genes. To this end, we selected two representative proteasome subunit genes, Psma7 and Psmb4, and performed qPCR analysis to assess their mRNA expression. MG132 treatment significantly induced the expression of the Psma7 and Psmb4 genes in a T-N1G-dependent manner (Fig. 1F). These results demonstrate that the T-N1G protein retained transactivation activity, so we could exploit this T-N1G TDR system to assess Nrf1-sMaf heterodimer function.
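The normalization described for Fig. 1F (expression relative to Hprt, with vehicle-treated control cells set to 1) can be illustrated with a minimal sketch. It assumes the widely used 2^-ddCt calculation, which the text does not name explicitly; the function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change by the 2^-ddCt method (an assumption; the method is not stated in the text).

    ct_target / ct_reference: Ct values of the target gene and the reference
    gene (e.g., Hprt) in the sample of interest.
    ct_target_ctrl / ct_reference_ctrl: the same values in the calibrator
    sample (e.g., vehicle-treated control cells), which is thereby set to 1.
    """
    delta_sample = ct_target - ct_reference
    delta_control = ct_target_ctrl - ct_reference_ctrl
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier
# (relative to the reference) in the treated sample than in the calibrator.
fold = relative_expression(22.0, 20.0, 24.0, 20.0)
print(fold)
```

With triplicate measurements, the mean and standard deviation of the resulting fold changes would correspond to the values plotted per condition.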

T-N1G widely activates a battery of proteasome subunit genes.

We next sought a comprehensive set of genes whose expression was induced by T-N1G. For this purpose, we treated T-N1G and control cells with the vehicle or with two different concentrations of MG132 (0.3 and 1.0 μM), extracted RNA, and performed RNA sequencing (RNA-Seq) analysis (Fig. 2A). Differential gene expression analyses yielded a heat map showing the expression of approximately 3,100 genes that were significantly upregulated in T-N1G-expressing cells under either basal or MG132-treated conditions (Fig. 2B). Notably, MG132 treatment provoked changes of various magnitudes in gene expression.

FIG 2.

T-N1G activates a battery of proteasome subunit genes. (A) Schematic diagram of the protocol for RNA-Seq. RNA was extracted from the control and T-N1G cells treated with the vehicle (Veh) or MG132 (0.3 or 1.0 μM) and subjected to RNA-Seq. (B) Heat map of the relative expression levels of genes that showed dose-dependent increases specifically in T-N1G-expressing cells (group I), genes whose expression was upregulated specifically in T-N1G-expressing cells before MG132 treatment (group II), and genes whose expression was induced by MG132 even in control cells at a magnitude similar to that in T-N1G-expressing cells (group III) (false discovery rate [FDR] of <0.05; log2 fold change [FC] of >0.7). Colors indicate the log2 FC values relative to the median expression level of each gene in control cells without MG132 treatment. (C) Pathway analysis of the T-N1G-dependent genes using KEGG data sets. The counts represent the number of T-N1G-dependent genes involved in each pathway, and the P values represent modified Fisher’s exact P values. (D) Heat map of the relative expression levels of proteasome subunit genes induced in a T-N1G-dependent manner. Heat map colors are shown in the same way as in panel B.

We found multiple profiles of expression changes in these analyses. Almost half of the genes showed dose-dependent increases in expression specifically in T-N1G-expressing cells (group I) (upper part of the heat map). We refer to these genes as T-N1G-dependent genes. We also found genes whose expression was induced by MG132 even in control cells without T-N1G, to a magnitude similar to that in T-N1G-expressing cells (group III) (lower part of the heat map). Intriguingly, there were genes whose expression was upregulated specifically in T-N1G-expressing cells even before MG132 treatment (group II) (middle part of the heat map). The expression of many of these genes was significantly suppressed by MG132 treatment, while that of some genes was not suppressed but rather induced by MG132 treatment. We also refer to the latter group II genes as T-N1G-dependent genes. Thus, we defined the sum of group I and some group II genes, i.e., 1,547 genes, as the T-N1G-dependent gene set (Fig. 2B).
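The selection thresholds given in the legend of Fig. 2B (FDR of <0.05 and log2 FC of >0.7) can be sketched as a simple filter. This is only an illustration of the cutoff logic: the input layout, function name, and example values are hypothetical, and the full group I/II/III assignment involves comparisons across conditions that are not reproduced here.

```python
def select_upregulated(results, fdr_cutoff=0.05, log2fc_cutoff=0.7):
    """Return gene names passing both cutoffs.

    results: dict mapping gene name -> (log2 fold change, FDR).
    The default cutoffs follow the Fig. 2B legend; everything else
    about this function is an illustrative assumption.
    """
    selected = []
    for gene, (log2fc, fdr) in results.items():
        if fdr < fdr_cutoff and log2fc > log2fc_cutoff:
            selected.append(gene)
    return sorted(selected)


# Hypothetical differential expression results (not data from the study).
example = {
    "GeneA": (1.8, 0.001),   # passes both cutoffs
    "GeneB": (0.5, 0.001),   # fold change too small
    "GeneC": (2.0, 0.20),    # not significant
    "GeneD": (0.9, 0.04),    # passes both cutoffs
}
print(select_upregulated(example))
```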

To assess how well the T-N1G-dependent gene set covers Nrf1 target genes, we next performed pathway analyses utilizing the KEGG Pathway database. The top enriched pathway in the T-N1G-dependent gene set was the proteasome pathway (Fig. 2C). When we examined the expression of each gene in the proteasome pathway, many proteasome subunit genes were found to be activated by MG132 in a T-N1G-dependent manner (Fig. 2D). This gene set includes genes encoding alpha/beta subunits of the 20S proteasome and regulatory subunits of the 26S proteasome. Based on these results, we conclude that the T-N1G-dependent gene set reasonably covers the canonical Nrf1 target genes.
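The "modified Fisher's exact P values" mentioned in the legend of Fig. 2C can be illustrated with a short sketch. One common modification is the EASE score used by the DAVID tools, which removes one gene from the list hits before running a one-sided Fisher's exact test; whether that is the exact procedure used here is an assumption, and the function names and counts are hypothetical.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """One-sided (enrichment) Fisher's exact P for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1 = a + b          # genes in the list
    col1 = a + c          # genes in the pathway
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, x) * comb(n - col1, row1 - x) for x in range(a, upper + 1)
    )
    return numerator / comb(n, row1)


def ease_style_p(list_hits, list_total, pop_hits, pop_total):
    """EASE-style P value: one list hit is removed before testing (assumed variant)."""
    a = list_hits
    b = list_total - list_hits
    c = pop_hits - list_hits
    d = pop_total - list_total - c
    return fisher_right_tail(max(a - 1, 0), b, c, d)


# Hypothetical toy counts: 3 of 4 listed genes fall in a pathway that
# contains 4 of the 8 genes in the background population.
print(fisher_right_tail(3, 1, 1, 3))
print(ease_style_p(3, 4, 4, 8))
```

Removing one hit makes the test more conservative, which penalizes pathways supported by only a handful of genes.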

Genome-wide binding analysis of T-N1G.

To obtain insights into how T-N1G regulates the expression of a set of genes, we next performed a genome-wide analysis of the binding sites of the T-N1G protein via chromatin immunoprecipitation sequencing (ChIP-Seq) analyses. To this end, chromatin extracts from T-N1G-expressing cells treated with or without MG132 were subjected to immunoprecipitation followed by deep sequencing analysis (Fig. 3A). We identified 3,894 and 6,576 T-N1G-binding sites in the two replicate experiments with MG132 treatment. We surmise that the differences in the numbers of identified sites in these two experiments might be inherent to the methodology and not significant in further analyses. Thus, 3,389 sites, identified in both experiments, were used for further analysis (Fig. 3B).
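The step of retaining only sites found in both replicates can be sketched as an interval-overlap check. The criterion used to call two peaks "the same site" is not specified in the text, so the any-overlap rule, the half-open coordinate convention, and the function name below are assumptions; the coordinates are made up.

```python
def shared_peaks(rep1, rep2):
    """Return peaks from rep1 that overlap at least one peak in rep2.

    Peaks are (chrom, start, end) tuples with half-open coordinates.
    Any overlap of at least 1 bp counts (an assumed criterion).
    """
    by_chrom = {}
    for chrom, start, end in rep2:
        by_chrom.setdefault(chrom, []).append((start, end))

    shared = []
    for chrom, start, end in rep1:
        candidates = by_chrom.get(chrom, [])
        if any(s < end and start < e for s, e in candidates):
            shared.append((chrom, start, end))
    return shared


# Hypothetical peak calls from two replicate experiments.
exp1 = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 150)]
exp2 = [("chr1", 250, 400), ("chr2", 500, 600), ("chr1", 1200, 1300)]
print(shared_peaks(exp1, exp2))
```

For genome-scale peak lists, a sorted sweep or an interval tree would replace the inner linear scan, but the overlap condition stays the same.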

FIG 3.

Genome-wide analysis of T-N1G-binding sites. (A) Schematic diagram of the protocol for ChIP-Seq. Chromatin extracts from T-N1G-expressing cells treated with 1.0 μM MG132 or the vehicle (Veh) were subjected to immunoprecipitation followed by next-generation sequencing analysis. (B) Venn diagram showing the overlap of the T-N1G-binding sites between two replicate experiments (Exp. 1 and Exp. 2). (C) Heat map of normalized ChIP-Seq signals from vehicle-treated cells and MG132-treated cells from the two experiments. (D) Representative T-N1G-binding sites in Psma5 and Psmb3 gene loci in T-N1G-expressing cells treated with the vehicle and MG132 (MG). (E) Consensus sequences of MARE, core MARE, CsMBE, and CsMBE-related sequences. (F) Pie chart showing the percentages of core MARE, the strictly defined CsMBE, and CsMBEs with mismatches of 0 and 1. (G) A sequence logo of TGASTCAGC identified in the T-N1G-binding sites is shown with their 5′- and 3′-flanking sequences.

Inspection of the heat maps of normalized ChIP-Seq signals showed that the majority of the T-N1G-binding sites were MG132 inducible (Fig. 3C). In Fig. 3D, we show two representative cases in which inducible T-N1G binding was observed. In the promoter-proximal regions of the Psma5 and Psmb3 genes, we found specific binding of T-N1G. As similar binding profiles were observed by endogenous Nrf1 ChIP-Seq analysis (33), these results support our hypothesis that the T-N1G protein recapitulates the binding profile of the Nrf1-sMaf heterodimer.

The consensus sequence of CsMBEs is 5′-RTGASnnnGC-3′ (where R is A/G and S is G/C) (Fig. 3E). The R at the 5′ end is a favorable base for CNC binding, while the GC at the 3′ end is a favorable sequence for sMaf binding (22). Importantly, CsMBEs are clearly distinct from MARE (TGCTGASTCAGCA) (Fig. 3E), in which both ends of the core have GC sequences, so sMaf homodimers preferentially recognize MARE. Therefore, the ideal consensus sequence of CsMBEs would be RTGASTCAGCA, which would bind CNC and Maf factors most favorably, and we refer to this sequence as the strictly defined CsMBE. Here, we also defined GCTGASTCAGC as a core MARE motif and RTGASTCAGC as a CsMBE with 0 mismatches (CsMBE-m0).

To examine how often CsMBEs appear in the identified T-N1G-binding regions, we searched these regions for each motif. We found that approximately 75% of the regions harbored the strictly defined CsMBE, and 7% and 8% of the regions harbored CsMBE-m0 and CsMBE-m1 (CsMBE with 1 mismatch in the nnn region), respectively (Fig. 3E and F). We also searched for the presence of MARE in the identified T-N1G-binding regions and found that approximately 1% of the regions contained the core MARE (Fig. 3E and F). These results indicate that CsMBEs are dominant in the T-N1G-binding regions.
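The motif definitions above translate directly into pattern matching. The following minimal sketch converts the IUPAC-style consensus sequences into regular expressions and classifies a sequence by the most stringent motif it contains on either strand. The precedence order, the scanning of both strands, and the reading of CsMBE-m1 as exactly one mismatch within the central TCA are assumptions made for illustration; the function names and test sequences are hypothetical.

```python
import re

# IUPAC codes used in the text: R = A/G, S = C/G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "S": "[CG]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def iupac_to_regex(motif):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


# Ordered from most to least stringent CsMBE, then core MARE.
MOTIFS = [
    ("strict CsMBE", iupac_to_regex("RTGASTCAGCA")),
    ("CsMBE-m0", iupac_to_regex("RTGASTCAGC")),
    # Assumed definition: exactly one mismatch within the central "TCA".
    ("CsMBE-m1", re.compile(r"[AG]TGA[CG](?:[ACG]CA|T[AGT]A|TC[CGT])GC")),
    ("core MARE", iupac_to_regex("GCTGASTCAGC")),
]


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def classify_region(seq):
    """Return the first motif class found on either strand, or 'none'."""
    strands = (seq.upper(), reverse_complement(seq))
    for name, pattern in MOTIFS:
        if any(pattern.search(strand) for strand in strands):
            return name
    return "none"


for region in ("CCATGACTCAGCATT", "CCGTGAGTCAGCTT", "CCATGACTTAGCTT", "AAGCTGACTCAGCTT"):
    print(region, classify_region(region))
```

Tallying the returned labels over all binding regions would give proportions of the kind summarized in the pie chart.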

To further investigate the flanking sequences of CsMBEs, we first extracted TGASTCAGC motifs, which could be the strictly defined CsMBE, CsMBE-m0, or core MARE depending on the surrounding sequence (Fig. 3E). We extracted 3,278 core TGASTCAGC motifs from within the T-N1G-binding sites. Next, we examined the base frequencies at the 6 positions flanking the 5′ end and the 7 positions flanking the 3′ end of these 3,278 core motifs. As shown in Fig. 3G, R (A/G) is the major base in the 5′ position adjacent to the core TGASTCAGC motif, while A is the major base in the 3′ position adjacent to the core motif. The 3′ region (7 bp) of the core motif is AT rich. These results demonstrate that T-N1G binds to the canonical CNC-sMaf-binding elements that have been identified in various previous studies, supporting the conclusion that T-N1G and the Nrf1-MafG heterodimer indeed bind to CsMBEs.
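The flanking-base analysis behind a sequence logo amounts to counting bases at each position next to the core motif. A minimal sketch follows, assuming each input sequence is already oriented so that the core TGASTCAGC motif reads 5′ to 3′ and that only the first occurrence per sequence is used; the function name and the two example sequences are hypothetical.

```python
import re
from collections import Counter

CORE = re.compile(r"TGA[CG]TCAGC")  # core TGASTCAGC motif, S = C or G


def flank_frequencies(sequences, n5=6, n3=7):
    """Per-position base frequencies for n5 bases 5' and n3 bases 3' of the core motif.

    Sequences lacking the motif or lacking complete flanks are skipped.
    Returns two lists of dicts (5' flank, 3' flank), each ordered 5' to 3'.
    """
    counts5 = [Counter() for _ in range(n5)]
    counts3 = [Counter() for _ in range(n3)]
    for seq in sequences:
        seq = seq.upper()
        match = CORE.search(seq)
        if match is None:
            continue
        left = seq[max(match.start() - n5, 0):match.start()]
        right = seq[match.end():match.end() + n3]
        if len(left) < n5 or len(right) < n3:
            continue
        for i, base in enumerate(left):
            counts5[i][base] += 1
        for i, base in enumerate(right):
            counts3[i][base] += 1

    def to_freq(counters):
        freqs = []
        for counter in counters:
            total = sum(counter.values())
            freqs.append({b: c / total for b, c in counter.items()} if total else {})
        return freqs

    return to_freq(counts5), to_freq(counts3)


# Two hypothetical binding-site sequences.
five_prime, three_prime = flank_frequencies(
    ["CCCCCATGACTCAGCATTTTTT", "GGGGGGTGAGTCAGCAAAAAAA"]
)
print(five_prime[-1])   # position immediately 5' of the core motif
print(three_prime[0])   # position immediately 3' of the core motif
```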

T-N1G contributes to proteostasis-related pathways.

As we obtained both the T-N1G-dependent gene expression profile and the T-N1G genome-wide binding profile, we surmised that the integration of these data would reveal important biological pathways that are directly regulated by T-N1G. In this integrated analysis, we utilized T-N1G genome-wide binding sites observed in at least one of the two replicate experiments. Of the 1,547 T-N1G-dependent genes identified through RNA-Seq, we highlighted 686 genes that had T-N1G-binding sites in their vicinity (within approximately 100 kb on either side) and designated them T-N1G-dependent genes with ChIP-Seq peaks (T-N1G-genes-w/peaks) (Fig. 4A).
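The integration step, pairing expression-defined genes with nearby binding sites, can be sketched as a window search around each gene. The text gives the window as approximately 100 kb on both sides but does not say whether it is measured from the gene body or from the TSS; the sketch below assumes the gene body, and the function name, gene names, and coordinates are hypothetical.

```python
def genes_with_nearby_peaks(genes, peaks, flank=100_000):
    """Return the set of gene names that have a peak within `flank` bp.

    genes: dict mapping name -> (chrom, start, end) of the gene body.
    peaks: list of (chrom, start, end) tuples, half-open coordinates.
    The window is the gene body extended by `flank` on each side
    (an assumption about how "vicinity" is measured).
    """
    hits = set()
    for name, (chrom, g_start, g_end) in genes.items():
        window_start = g_start - flank
        window_end = g_end + flank
        for p_chrom, p_start, p_end in peaks:
            if p_chrom == chrom and p_start < window_end and window_start < p_end:
                hits.add(name)
                break
    return hits


# Hypothetical genes and peaks.
genes = {
    "GeneA": ("chr1", 500_000, 520_000),
    "GeneB": ("chr1", 900_000, 910_000),
    "GeneC": ("chr2", 500_000, 520_000),
}
peaks = [("chr1", 410_000, 410_300), ("chr1", 1_200_000, 1_200_400)]
print(sorted(genes_with_nearby_peaks(genes, peaks)))
```

Intersecting the returned set with the expression-defined gene set gives the genes that satisfy both criteria.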

FIG 4.

T-N1G activates genes related to the ER, chaperone, ubiquitin-related degradation, and RNA metabolism. (A) Venn diagram showing the overlap between T-N1G-dependent genes (pink) and genes close to T-N1G-binding sites (blue). (B) Functional classification of 380 T-N1G-genes-w/peaks. The names of PANTHER protein classes and protein subclasses (with a minus sign at the beginning) are shown. The number in parentheses is the number of genes classified in each protein class. (C) Heat map of the relative expression levels of genes induced in a T-N1G-dependent manner. Heat map colors are shown in the same way as in Fig. 2B.

We then performed functional annotations of the T-N1G-genes-w/peaks by referring to the PANTHER protein class database. As shown in Fig. 4B and Table S1 in the supplemental material, we annotated 418 genes based on the PANTHER protein classes. The protein class with the highest number of genes was “protein-modifying enzyme,” containing 87 genes. Other predominant protein classes included “metabolite interconversion enzyme,” with 62 genes; “nucleic acid metabolism protein,” with 50 genes; “gene-specific transcriptional regulator,” with 40 genes; and “transporter,” with 36 genes. These results suggest that the Nrf1-MafG heterodimer directly regulates a wide range of genes.

Since PANTHER annotation has limitations, we then conducted manual annotation of the 686 T-N1G-genes-w/peaks to determine whether the range of Nrf1 target genes could be expanded. We found that the T-N1G-genes-w/peaks included many genes in the categories endoplasmic reticulum (ER)-related protein processing, ubiquitin-related degradation, and chaperone functions (Fig. 4C). These results suggest the intriguing possibility that Nrf1 regulates a broad range of proteostatic stress response genes.

Interestingly, the T-N1G-genes-w/peaks also included genes in the category of RNA metabolism, which included RNA helicase and RNA splicing factor genes (Fig. 4C). As the RNA splicing process is modulated during cellular adaptation to heat stress, which causes protein denaturation (34), the Nrf1-MafG heterodimer may also contribute to the proteostatic stress response by modulating RNA splicing. Based on these broad observations, we conclude that the Nrf1-MafG heterodimer contributes to a wide range of proteostatic stress responses, including RNA metabolism, directly through binding to CsMBEs.

The mutual relationship or interchangeability of Nrf1 and Nrf2 has been explored, but at this point, it remains an enigma. Based on the results described above, we wondered whether the expression of proteostatic stress response genes was also regulated by the Nrf2-sMaf heterodimer. To address this point, we assessed our RNA-Seq data obtained from tethered Nrf2-MafG heterodimer (T-N2G)-expressing cells with or without treatment with the Nrf2 inducer diethyl maleate (DEM) (29). We found that most of the T-N1G-induced proteostatic stress response genes were not activated substantially by DEM in the T-N2G-dependent system (Fig. 4C, right). These results suggest that proteostatic stress response genes are regulated mainly by Nrf1-sMaf heterodimers but not Nrf2-sMaf heterodimers.

T-N1G binds to the proximal region of proteostatic stress response genes.

To further explore the contribution of T-N1G to the expression of proteostatic stress response genes, we conducted closer examinations of the ChIP-Seq peaks in a set of T-N1G-genes-w/peaks. We evaluated the binding of T-N1G to these genes with regard to the induction ability, location relative to the transcriptional start site, and the presence of CsMBEs. We also compared the ChIP-Seq peaks of T-N1G with those of T-N2G.

As shown in Fig. 5, we first focused on the Vcp gene, encoding valosin-containing protein (Vcp). Vcp is known to form a complex with the polyubiquitin-chain-binding proteins NPL4 homolog, ubiquitin recognition factor (Nploc4) and ubiquitin recognition factor in ER-associated degradation 1 (Ufd1), and to contribute to the ERAD of proteins (35). In the Vcp locus, two T-N1G-binding peaks were identified, both of which were markedly induced by MG132. One was in the promoter-proximal region and the other was in an intronic region; the latter harbored one CsMBE-m0. Notably, we also identified an MG132-inducible binding peak of T-N1G in the Nploc4 gene; a strictly defined CsMBE resided in an intron of Nploc4 approximately 50 kb downstream from the transcription start site (TSS). These findings suggest that Nrf1 directly contributes to the expression of the ERAD system.

FIG 5.

T-N1G-binding sites found in the proximal region of genes related to the ER, chaperone, ubiquitin-related degradation, and RNA metabolism. Representative T-N1G-binding profiles in selected gene loci in the T-N1G-expressing cells treated with the vehicle (Veh) and MG132 (MG) are shown with fragments per kilobase per million reads (FPKM) (y axis) expression data for each gene in control and T-N1G-expressing cells treated with the vehicle (−) or 0.3 μM (+) and 1.0 μM (++) MG132. The binding profile of T-N2G in the T-N2G-expressing cells treated with the vehicle and DEM was retrieved from a previous study. All tracks in a given comparison have the same scaling factor for the y axis, as indicated in the left-hand region of each track. The locations (in kilobase pairs relative to the TSS) and sequences of each CsMBE (in red) are shown.

MG132-inducible T-N1G binding was also found in proximal regions of ER-related genes (Rad23b [RAD23 homolog B, nucleotide excision repair protein] and Sec24d [SEC24 homolog D, COPII coat complex component]), chaperone genes (Tbce [tubulin-specific chaperone E] and Tbcel [tubulin-specific chaperone E-like]), ubiquitin-related genes (Ube4a [ubiquitination factor E4A], Usp47 [ubiquitin-specific peptidase 47], and Usp14 [ubiquitin-specific peptidase 14]), and RNA metabolism-related genes (Dhx35 [DEAH box helicase 35]). The CsMBE, CsMBE-m0, or CsMBE-m1 was localized in these genes, and these CsMBEs are shown as red characters in the sequences in Fig. 5. Closer inspection revealed that some genes had T-N1G-binding peaks in their promoter-proximal regions, while the other genes had peaks far away from the TSS. There were genes with multiple T-N1G-binding sites, while some genes retained only one T-N1G-binding site. Thus, the binding sites for MG132-induced T-N1G appear to be highly pleiotropic.

These observations prompted us to ask whether these T-N1G-binding patterns resulted in diversity in transcriptional activity. To answer this question, we examined the expression level of each gene by the normalized read counts from the RNA-Seq analysis. We interpreted the results as indicating that the higher the number of counts was, the more actively the genes were transcribed, as long as the stability of the mRNAs did not change substantially. Notably, the induction levels of these genes varied significantly. The number of T-N1G-binding sites or their distance from the TSS did not correlate directly with the enhancing activity of T-N1G induced by MG132 treatment. For instance, the Sec24d gene harbored four T-N1G-binding sites, while Nploc4 harbored only one, but transcriptional activation by MG132 treatment was more significant in the latter. While the Tbce and Nploc4 genes retained single T-N1G-binding sites approximately +5.5 kbp and +50 kbp downstream from the TSS, respectively, T-N1G-dependent transcriptional activation was much stronger in the latter case. These results suggest that T-N1G-mediated transcriptional activation meets the definition of an enhancer, which functions independently of its position (36), and that T-N1G activity collaborates with various regulatory machineries in individual contexts.
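The expression values compared here are length- and depth-normalized read counts, reported as FPKM in Fig. 5. A minimal sketch of the standard FPKM formula follows; the function name and the example numbers are hypothetical, and the actual pipeline used to compute the values is not described in this section.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments).
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2-kb transcript in a
# library of 10 million mapped fragments.
print(fpkm(500, 2_000, 10_000_000))
```

Because the value is normalized for both gene length and library size, it allows comparison of the same gene across treatment conditions, subject to the caveat about mRNA stability noted above.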

To gain insights into the molecular mechanisms underlying target gene selection by CNC-sMaf heterodimers, we asked whether T-N2G binds similarly to the regulatory regions identified by T-N1G binding within proteostatic stress response genes. To this end, we revisited the ChIP-Seq data obtained from the T-N2G-expressing cells (29). As a result, T-N2G binding with reasonable strength was observed at the T-N1G-binding sites in the Nploc4, Rad23b, Sec24d, Tbce, and Tbcel genes, but the binding of T-N2G was very weak at sites in the Vcp, Ube4a, Usp47, Usp14, and Dhx35 genes compared to that of T-N1G (Fig. 5). These results support the notion that while T-N2G binds to some of the CsMBEs identified by T-N1G binding within proteostatic stress response genes, it could not participate effectively and efficiently in the transcriptional regulation of these genes, presumably due to a lack of cooperation with coactivators.

T-N1G-binding regions overlap the binding region of XBP1.

To further delineate the roles that T-N1G plays in the transcriptional network, we explored transcription factors that share binding specificities with the T-N1G protein. To this end, we compared the T-N1G-binding profiles with publicly available ChIP-Seq data deposited in the Cistrome DB Toolkit (37). This toolkit yields a Giggle score, which represents the similarity of transcription factor-binding sites. We found that the binding sites of Nrf1, MafF, MafG, and MafK showed a high degree of overlap with those of T-N1G (Fig. 6A). This observation strongly indicates that the binding profiles of T-N1G reflect the properties of Nrf1-sMaf heterodimers. In addition, Nrf2, NF-E2 p45, and Bach2 were identified as factors whose binding sites overlapped those of T-N1G, suggesting that various CNC-sMaf heterodimers indeed bind to DNA in an overlapping manner.

FIG 6.

T-N1G-binding sites partially overlap XBP1-binding sites. (A) Overlap of the T-N1G-binding sites with publicly available data, as determined by the Cistrome DB Toolkit. The Giggle score represents the significance of the overlap between the T-N1G peaks and the Cistrome DB sample peaks. (B) Heat map of normalized ChIP-Seq signals of T-N1G from experiment 1 and XBP1 from control liver (vehicle [Veh]) (SRA accession no. SRR4064260) and liver obtained 6 h after partial hepatectomy (PH) (accession no. SRR4064262) in the T-N1G-binding regions. (C) Representative T-N1G- and XBP1-binding sites in the Vcp, Ube4a, and Tbce gene loci in MG132-treated T-N1G-expressing cells (T-N1G) and livers 6 h after partial hepatectomy (XBP1).

Notably, among the other factors that showed binding similarity with T-N1G, a master regulator of the unfolded-protein stress response, XBP1 (38), was identified. The binding profile of XBP1 utilized here was taken from a previous ChIP-Seq analysis examining the XBP1-binding profile in mouse livers in response to partial hepatectomy (39). According to that study, the unfolded-protein stress response is activated during liver regeneration, resulting in the activation of XBP1. To further confirm the binding similarity of T-N1G and XBP1, we obtained raw XBP1 ChIP-Seq data from the GEO database (accession no. GSE86048) and examined the XBP1 binding profile to the region where T-N1G binds. As shown in a heat map in Fig. 6B, some XBP1-binding sites overlapped those of T-N1G well. Strong XBP1 binding was observed at the strong T-N1G-binding sites. Detailed analysis revealed that there was a significant overlap between XBP1 and T-N1G in the regulatory regions of stress response genes related to proteostasis, such as Vcp, Ube4a, and Tbce (Fig. 6C). These results thus demonstrate that XBP1 binding overlaps that of Nrf1-sMaf heterodimers, suggesting a possible interrelationship between these two factors.

T-N1G has the potential to activate Nrf2-dependent oxidative and xenobiotic stress response genes.

Taking advantage of the T-N1G system, we evaluated how T-N1G influences the expression of Nrf2 target cytoprotective genes. It should be noted, however, that in this study, we analyzed Nrf1 function in a pure background with tethered MafG but without the influence of any other sMaf or CNC transcription factor contributions.

When we focused on genes that were previously found to be induced by T-N2G and DEM (Fig. 7A, right), T-N1G induced the expression of a major group of these T-N2G target and DEM-inducible genes in the presence of MG132 (Fig. 7A, left). On the other hand, the expression of the other Nrf2 target group genes, including fucose-1-phosphate guanylyltransferase (Fpgt), glucose-6-phosphate dehydrogenase X-linked (G6pdx), and 1,4-alpha-glucan branching enzyme 1 (Gbe1), was not induced significantly by T-N1G and MG132. It should be noted that the induction pattern differed from gene to gene. For instance, the expression levels of NAD(P)H quinone dehydrogenase 1 (Nqo1) and glutathione S-transferase Mu 2 (Gstm2) were already high at steady state in T-N1G-expressing cells, and they did not change substantially after MG132 treatment. Since these genes were repressed by MG132 in control cells, it is presumed that the induction level of these genes was determined by the competition between activation mediated by T-N1G and repression mediated by some unknown factor(s).

FIG 7.


T-N1G activates Nrf2-dependent cytoprotective genes. (A) Heat map of the relative expression levels of Nrf2-dependent cytoprotective genes induced in a T-N1G- and/or T-N2G-dependent manner. Heat map colors are shown in the same way as in Fig. 2B. (B) Representative T-N1G- and T-N2G-binding sites in the Nqo1, Txnrd1, Fth1, Gclc, Gsta4, Slc48a1, and Me1 gene loci in T-N1G-expressing cells treated with the vehicle (Veh) and MG132 (MG) and in T-N2G-expressing cells treated with the vehicle and DEM.

Through ChIP-Seq analysis, we found that T-N1G binds strongly to the regulatory regions of representative Nrf2 target genes, such as Nqo1, thioredoxin reductase 1 (Txnrd1), ferritin heavy chain 1 (Fth1), glutamate-cysteine ligase catalytic subunit (Gclc), and NADP-dependent malic enzyme (Me1) (Fig. 7B). These T-N1G-binding sites overlapped those of T-N2G (Fig. 7B). The binding level of T-N1G in these gene loci was comparable to or higher than that in the Psma5 and Psmb3 gene loci (Fig. 3D and Fig. 7B), demonstrating that the tethered Nrf1-MafG heterodimer has the potential to bind and activate Nrf2 target genes.

An important conclusion inferred from these results is that in the regulation of transcription factors, the availability of the factors or their family members in various situations is critically important. While Nrf1 can bind and activate the majority of Nrf2 target genes, Nrf1 is not induced strongly under oxidative and electrophilic stress conditions. Thus, these results clearly indicate that the regulation of expression acquired by individual factors during molecular evolution is a critical determinant of their functional roles, rather than the individual molecular characteristics acquired by the transcription factor family members.

Some T-N1G-binding sites overlap SINE B2 family repeats.

Through a closer examination of the T-N1G ChIP-Seq data by de novo motif analysis, we found that the T-N1G-binding sequences included a specific combination of motif 1 and motif 2. Motif 1 corresponds to the CsMBE, and motif 2 is a unique motif (Fig. 8A). Motif 1 or the CsMBE was significantly enriched at the center of the tested regions, while motif 2 was enriched at regions approximately 25 bp away from motif 1 or the CsMBE (Fig. 8B). Importantly, we discovered that motif 2 showed a substantial overlap of sequences in the SINE B2 repeats, including B2_Mm2 and B3, which are registered in the Dfam repeat database (40) (Fig. 8A).
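The spacing relationship described above can be summarized by tabulating, for each peak that contains both motifs, the offset of motif 2 from motif 1. The sketch below does this with hypothetical motif positions expressed relative to the peak center; the data structure, the peak names, and the values are assumptions for illustration only and do not reproduce the analysis behind Fig. 8B.

```python
from collections import Counter


def offset_histogram(motif_hits):
    """Count (motif 2 - motif 1) offsets in bp across peaks containing both motifs.

    motif_hits maps a peak ID to a dict that may hold 'motif1' and 'motif2'
    start positions, each given relative to the peak center.
    """
    offsets = Counter()
    for hits in motif_hits.values():
        if "motif1" in hits and "motif2" in hits:
            offsets[hits["motif2"] - hits["motif1"]] += 1
    return offsets


# Hypothetical positions; peak_d lacks motif 2 and is therefore skipped.
toy_hits = {
    "peak_a": {"motif1": 0, "motif2": 25},
    "peak_b": {"motif1": -2, "motif2": 23},
    "peak_c": {"motif1": 1, "motif2": 27},
    "peak_d": {"motif1": 3},
}

hist = offset_histogram(toy_hits)
most_common_offset, count = hist.most_common(1)[0]
print(most_common_offset, count)
```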

FIG 8.


A portion of the T-N1G-binding sites overlapped SINE B2 family repeats. (A) Motif 1 and motif 2, which were identified by de novo motif analysis of the T-N1G-binding sites, are shown with the strictly defined CsMBE and a portion of SINE B2_Mm2/B3 sequences that show homology in each motif, respectively. (B) Average profiles of the two enriched motifs in the T-N1G-binding sites. (C, left) Pie chart showing the proportion of T-N1G-binding regions that overlapped SINE B2 family repeats in all T-N1G-binding regions. (Right) Pie chart showing the proportion of each SINE B2 family repeat as a percentage of the total number of SINE B2 family repeats overlapping the T-N1G-binding regions. (D) Consensus sequences of the 5′ ends of SINE B3, B2_Mm2, B3A, B2_Mm1t, and B2_Mm1a. CsMBE-like sequences and motif 2-like sequences are encircled by blue and red lines, respectively. (E) Pie chart showing the proportion of CsMBE-m0 overlapping the SINE B2 family repeats in CsMBE-m0 identified in the genome. (F) The location of CsMBE-like sequences in the SINE B2 family repeats is shown with counts.

Therefore, we next examined how widely the T-N1G-binding sites overlapped the SINE B2 family repeats. Of the 3,389 identified T-N1G-binding regions, 1,155 regions harbored at least one repeat belonging to the five SINE B2 families, while the remaining 2,234 regions did not harbor any such SINE B2 family sequences (Fig. 8C). Among the SINE B2 family sequences found in these 1,155 regions, B3, B2_Mm2, and B3A repeats were relatively abundant (Fig. 8C).
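The counts in this paragraph can be checked with a few lines of arithmetic. The snippet below uses only the two numbers stated in the text (3,389 total regions and 1,155 regions with a SINE B2 family repeat); the variable names are illustrative.

```python
total_regions = 3389   # all identified T-N1G-binding regions
with_sine_b2 = 1155    # regions harboring at least one SINE B2 family repeat

# Regions without any SINE B2 family repeat, and the SINE B2-positive share.
without_sine_b2 = total_regions - with_sine_b2
percent_with_sine_b2 = 100 * with_sine_b2 / total_regions

print(without_sine_b2)                  # 2234
print(round(percent_with_sine_b2, 1))   # 34.1
```

Roughly one-third of the T-N1G-binding regions therefore overlap a SINE B2 family repeat.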

To confirm the relationship between CsMBEs and SINE B2 elements, we conversely examined whether the majority of the SINE B2 family sequences retained the CsMBE-like sequences. For this purpose, we explored CsMBE-like motifs within the consensus sequences of the five known SINE B2 repeat sequences utilizing the Dfam repeat sequence database. To our surprise, we found a CsMBE-related sequence located at the beginning of the SINE B2 repeats that was highly conserved across all five SINE B2 family sequences (Fig. 8D). These CsMBE-like sequences were followed by motif 2 sequences, whose conservation among the five subclasses was moderate compared to that of the CsMBE-like sequences.

Further inspection of the conserved CsMBE-like sequences within the five SINE B2 repeats revealed that the A in the fourth position of CsMBE (RTGASTCAGC [underlined position]) appeared to be G (dot in Fig. 8D). In addition, in the B2_Mm1a repeats, the last position of CsMBE (RTGASTCAGC [underlined]) appeared to be T. Based on these observations, we propose the following intriguing hypothesis: the SINE B2 family contains a conserved sequence that shares similarity with CsMBEs, and this original sequence acquired mutations and evolved to CsMBEs during the course of molecular evolution.
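To make the comparison with the consensus concrete, the sketch below expands the IUPAC consensus RTGASTCAGC (R = A or G, S = C or G) into a regular expression and reports the positions at which a candidate site departs from it. The three 10-bp test sites are hypothetical sequences constructed to mimic the substitutions described above; they are not the actual Dfam consensus sequences, and the helper names are assumptions.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
         "K": "[GT]", "M": "[AC]", "N": "[ACGT]"}

CSMBE = "RTGASTCAGC"  # consensus as written in the text


def iupac_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))


def mismatches(consensus, site):
    """Return the 1-based positions where site departs from the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have equal length")
    return [i + 1 for i, (c, s) in enumerate(zip(consensus, site))
            if not re.fullmatch(IUPAC[c], s)]


strict_site = "ATGACTCAGC"    # hypothetical exact match
b2_like_site = "ATGGCTCAGC"   # hypothetical: G instead of A at position 4
b2_mm1a_like = "ATGGCTCAGT"   # hypothetical: additionally T at the last position

pattern = iupac_to_regex(CSMBE)
print(bool(pattern.fullmatch(strict_site)))   # True
print(mismatches(CSMBE, b2_like_site))        # [4]
print(mismatches(CSMBE, b2_mm1a_like))        # [4, 10]
```

Under this view, a single point mutation at position 4 (and one more at position 10 in the B2_Mm1a-like case) would be enough to convert such a site into a sequence matching the consensus.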

To address this hypothesis, we explored CsMBEs that are embedded in SINE B2 repeats throughout the mouse genome. While the mouse genome contains 51,489 CsMBEs (CsMBE-m0 in Fig. 3E), 4,280 overlaps were found between CsMBE-m0 and SINE B2 repeats (Fig. 8E). The majority of the SINE B2 repeats harbored CsMBEs 21 bp away from the start sites of their defined sequences (Fig. 8F). In this regard, it is interesting to note that the T-N2G (i.e., tethered NRF2-MafG)-binding sites also overlapped SINE B2 family repeats (29). These wide-ranging observations support the notion that some of the CsMBEs were derived from the CsMBE-like motifs in the SINE B2 repeats, implying that the SINE B2 elements contributed to the genome-wide expansion and generation of the diversity of CsMBEs.
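Two quantities in this paragraph lend themselves to a short worked example: the share of CsMBE-m0 sites that fall within SINE B2 repeats, and the position of a motif relative to the start of its repeat. The sketch below assumes that each of the 4,280 overlaps corresponds to one CsMBE-m0 site, uses half-open coordinates, and measures the offset from the 5′ end of the repeat in a strand-aware manner. The function name and the toy coordinates are hypothetical, with the toy values chosen to mirror the 21-bp spacing mentioned in the text.

```python
total_csmbe_m0 = 51489   # CsMBE-m0 sites in the mouse genome (from the text)
in_sine_b2 = 4280        # overlaps between CsMBE-m0 and SINE B2 repeats

percent_in_sine_b2 = 100 * in_sine_b2 / total_csmbe_m0
print(round(percent_in_sine_b2, 1))   # 8.3


def offset_from_repeat_start(motif, repeat):
    """Distance in bp from the 5' end of a repeat to the motif, strand-aware.

    motif is (start, end); repeat is (start, end, strand). Coordinates are
    half-open. Returns None when the motif is not contained in the repeat.
    """
    m_start, m_end = motif
    r_start, r_end, strand = repeat
    if m_start < r_start or m_end > r_end:
        return None
    return m_start - r_start if strand == "+" else r_end - m_end


# Hypothetical coordinates for a plus-strand and a minus-strand repeat.
print(offset_from_repeat_start((1021, 1031), (1000, 1180, "+")))   # 21
print(offset_from_repeat_start((5149, 5159), (5000, 5180, "-")))   # 21
```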

SINE B2-associated CsMBEs are found in the proximity of specific genes, including proteostatic stress response genes.

We next explored genes whose regulatory regions harbor T-N1G-binding sites overlapping SINE B2 repeats. In Table S1, we have included a column that shows whether individual genes harbored the T-N1G-binding site(s) overlapping the SINE B2 repeat(s). Importantly, most of the T-N1G-binding sites found in the proteasome subunit genes did not overlap SINE B2 repeats (Table S1). Similarly, most of the canonical Nrf2 target genes, including cytoprotective and metabolic genes (shown in Fig. 7A), did not harbor a T-N2G-binding sequence linked with SINE B2 repeats. These results suggest that CsMBEs in the proximity of canonical Nrf1 target genes as well as CsMBEs in the proximity of canonical Nrf2 target genes perhaps did not originate from SINE B2 repeats. In contrast, some of the CsMBEs might have been added to the new CNC-sMaf target genes or distal positions of classic CNC-sMaf target genes through mobile SINE repeats, and these CsMBEs likely contribute to the increase in the diversity of CsMBEs and/or CNC-sMaf target gene regulation.

Indeed, we found in a limited number of cases that T-N1G-binding sites in proteasome subunit genes or in T-N1G-genes-w/peaks were linked with SINE B2 repeats. For instance, in the two T-N1G-binding sites found in the Psmd4 locus, the distal site that included the strictly defined CsMBE resided in a SINE B2 repeat, B3 (Fig. 9A). Similarly, in the two T-N1G-binding sites found in the Vcp locus, the distal site that included CsMBE-m0 resided in a SINE B2 repeat, B2_Mm2 (Fig. 9B). However, it remains to be determined how strongly the SINE B2-associated CsMBE contributes to the expression of the Psmd4 and Vcp genes.

FIG 9.


The locations and sequences of the SINE B2 family repeats identified in the proximity of the Psmd4 (A), Vcp (B), Ube4a (C), and Sec24d (D) genes are shown with T-N1G binding profiles. Magnified views of the RepeatMasker track where SINE B2 repeats are located are shown. Arrows indicate magnified regions. The locations of CsMBE- and motif 2-like sequences are boxed with blue and red lines, respectively.

On the other hand, in the Ube4a gene, there are two T-N1G-binding sites. The promoter-proximal site harbors the SINE B2 repeat (Fig. 9C), suggesting that the SINE B2-associated CsMBE contributes primarily to the expression of the Ube4a gene. In addition, SINE B2-associated CsMBEs were found in the proximity of some proteostatic stress response genes, including the Sec24d gene (Fig. 9D and Table S1). Taken together, these results support the hypothesis that SINE B2 increased the diversity of Nrf1 target genes by spreading the CsMBE-like motifs genome-wide during molecular evolution.

DISCUSSION

Transcription factors often function by forming specific heterodimers (2). Deciphering the functions of such heterodimers is one of the major challenges for understanding gene regulatory networks where multiple transcription factors intersect. In this study, we addressed this issue by using the CNC-sMaf transcription factor family and the tethered dimer rescue (TDR) system. In particular, we evaluated the molecular function of T-N1G in comparison with that of T-N2G. Integration of the RNA-Seq and ChIP-Seq data revealed that T-N1G activates the expression of proteasome subunit genes, well-known Nrf1 target genes, and binds specifically to CsMBEs in the proximity of the genes. Importantly, as summarized in Fig. 10, T-N1G was also found to activate genes involved in proteostasis-related pathways, indicating that the Nrf1-MafG heterodimer regulates a wide range of proteostatic stress response genes. Taking advantage of the TDR assessment system, we found that Nrf1 has the potential to activate canonical Nrf2 target genes when induced strongly, indicating that gene/protein expression regulation is more critical for CNC factor selection than differences in the molecular characteristics per se. These results thus provide critical information for the functional contributions of transcription factor heterodimers. In addition, we found that transposable SINE B2 repeats harbor CsMBEs with high frequency and have contributed to the target gene diversity of the CNC-sMaf transcription factors.

FIG 10.


Target gene selection diversity and specificity of Nrf1-MafG and Nrf2-MafG heterodimers. In response to proteostatic stresses, Nrf1-MafG activates not only proteasome subunit genes but also a variety of genes involved in the proteostatic stress response. On the other hand, in response to oxidative and electrophilic stresses, the Nrf2-MafG heterodimer activates a battery of antioxidant and xenobiotic-metabolizing enzyme genes. While this study showed that the Nrf1-MafG heterodimer has the potential to activate antioxidant and xenobiotic-metabolizing enzyme genes, this remains to be further evaluated.

One of our concerns before starting this series of studies was whether artificially tethered heterodimers, i.e., T-N1G and T-N2G, truly recapitulate the function of Nrf1-sMaf and Nrf2-sMaf heterodimers. Our current study using T-N1G clearly revealed that proteasome subunit genes known to be regulated by the Nrf1-sMaf heterodimer were identified as T-N1G-dependent target genes in the RNA-Seq experiment, and T-N1G was shown to bind in the proximity of these genes through CsMBEs. Similarly, our previous study revealed that most of the Nrf2-sMaf target cytoprotective genes were T-N2G target genes, and T-N2G binds in the proximity of these genes through CsMBEs. Based on these results, we conclude that the tethered dimers recapitulate the CNC-sMaf heterodimer functions.

One of the salient findings in this study is that the Nrf1-MafG heterodimer regulates the expression of genes related to ER-related protein processing, various E3 ligases, and chaperone and RNA metabolism. Our current finding is consistent with previous reports showing that Nrf1 directly regulates the ERAD-related genes Vcp and Nploc4 (33, 41). Another intriguing observation in this study, identified via ChIP-Seq analyses, was that T-N1G-binding regions overlap, at least in part, the XBP1-binding regions. XBP1 is a critical transcription factor that participates in the regulation of the ER stress response or unfolded-protein response (38). This observation suggests that there may be cross talk between these two factors to execute finely tuned gene regulation to cope with proteostatic stress.

Whether the target genes of Nrf1 and Nrf2 are similar or distinct has long been controversial. We showed in this study that T-N1G has the potential to activate Nrf2 target antioxidant and xenobiotic-metabolizing enzyme genes in a stable transfection experiment. This observation is consistent with previous studies that showed that Nrf1 can activate a set of Nrf2 target genes (14, 42). On the other hand, it has also been observed that Nrf1 does not contribute significantly to the expression of oxidative/electrophilic stress response genes under physiological conditions. For instance, the expression of oxidative/electrophilic stress response genes was not affected substantially in Nrf1-deficient mouse livers, while the expression of proteasome subunit genes was significantly impaired (27). Importantly, the expression of the former genes was significantly impaired in Nrf2-deficient livers (27), indicating that the contribution of Nrf2 to the stress response is more significant than that of Nrf1 in the liver.

In this regard, it should be noted that in our TDR system, the molecular nature of each CNC-sMaf tethered dimer was examined purely in the absence of sMaf proteins, and the various interactions with or influences of the other bZIP factors were eliminated. However, we surmise that under physiological and/or pathological conditions, specific role sharing might occur between Nrf1 and Nrf2. Under extreme conditions such as acute arsenic exposure (14), Nrf1 may act as a regulator of the oxidative stress response, which is usually controlled by Nrf2. We also surmise that the perturbation of proteostasis under physiological conditions is different from that caused by strong proteasome inhibitors. Given that MG132 activates stress kinases (32), we also need to consider the possibility that MG132-induced T-N1G cooperatively activates transcription in collaboration with other MG132-activated factors. The identification of an endogenous activator of Nrf1 will provide a better understanding of the physiological function of Nrf1.

We found that rodent-specific transposable SINE B2 elements harbor CsMBEs with high frequency and that these elements serve as binding sites for T-N1G. We previously noticed in the analysis of T-N2G-binding sites that some SINE B2 elements included CsMBEs (29). We also noticed that the human transposable element LINE1 subfamilies L1ME and L1MB frequently overlap CsMBEs (43). In this study, we extensively examined the relationship between SINE B2 elements and CsMBEs and found that while CsMBEs were frequently identified in the SINE B2 elements, the CsMBEs found in the major Nrf1 and Nrf2 target genes did not show a close relationship with the SINE B2 elements. Similarly, human CsMBEs linked to major Nrf2 target genes are largely unrelated to LINE1 elements (43). These findings led us to hypothesize that the SINE B2 and LINE1 families of transposable elements contributed to the diversity of CNC-sMaf-mediated gene regulation in mice and humans, respectively, during molecular evolution. These findings also suggest that the CNC-sMaf regulatory network for major Nrf1 and Nrf2 target genes was established at a much earlier stage of molecular evolution than the stage at which these transposable elements acted to increase diversity. In the latter situation, a gene in the vicinity of a randomly inserted SINE B2/LINE1 element might obtain CsMBEs or CsMBE-like sequences by acquiring point mutations, so that the gene eventually becomes a stress response gene. If the newly acquired gene regulation was beneficial to the individual, it was maintained during the course of evolution. Although the biological significance of each transposable element-associated CsMBE remains to be determined by further analysis, such as cis-element editing with the CRISPR/Cas9 system, we propose that SINE B2 and LINE1 elements in mice and humans, respectively, provide an add-on-type response to oxidative, electrophilic, and proteostatic stresses to realize adaptations to the environment.

In summary, the functional analysis of T-N1G suggests that the Nrf1-MafG heterodimer has the potential to regulate a wide range of stress response genes, including those responsive to oxidative, xenobiotic, and proteostatic stresses. However, we believe that CsMBE-mediated gene regulation is governed by the balance of CNC and sMaf factors, some of which are under strict inducible regulation. CNC-sMaf-based gene expression regulation should be interpreted in the context of the molecular nature and intracellular abundance of the factors. A comprehensive analysis of the abundance of each CNC and sMaf protein, together with fundamental knowledge of each heterodimer function, will provide a better understanding of CNC-sMaf-mediated gene expression regulation.

MATERIALS AND METHODS

Plasmid constructs.

A T-N1G expression vector was generated by PCR. In the first PCR, FLAG-mouse Nrf1 and mouse MafG cDNAs were fused with a portion of the linker peptide sequences. Using these PCR products as the templates, T-N1G cDNA was amplified by the second PCR with primers containing sequences homologous to the expression vector. By using the In-Fusion system (TaKaRa Bio), the resulting PCR product was inserted into the PiggyBac dual-promoter vector containing red fluorescent protein (RFP) and a puromycin selection marker (catalog no. PB514B-2; System Biosciences). Primer sequences are available upon request.

Establishment of mouse embryonic fibroblasts stably expressing T-N1G.

Immortalized sMaf-deficient mouse embryonic fibroblasts (MEFs) were previously established from MafF−/−:MafG−/−:MafK−/− mice (29). To obtain MEFs stably expressing the T-N1G protein, the T-N1G expression vector was cotransfected with a transposase expression vector into sMaf-deficient MEFs. At 48 h posttransfection, cells were treated with puromycin (5 μg/mL) for the selection of stably transfected cells. The expression of RFP was confirmed. Control cells, into which the PiggyBac empty vector was introduced, were previously established (29).

Immunoblot analysis.

Nuclear lysates were prepared by using NE-PER nuclear and cytoplasmic extraction reagents (Thermo Scientific), solubilized in Laemmli SDS sample buffer, and subjected to SDS-PAGE followed by semidry electroblotting onto an Immobilon-P membrane (Millipore, Bedford, MA). The expression of T-N1G was detected using an anti-Nrf1 antibody (clone D5B10; Cell Signaling Technology). An anti-lamin B1 antibody (catalog no. 33-2000; Zymed) was used as a loading control for the nuclear protein.

RNA purification and qPCR analyses.

Total RNA was extracted using a Qiagen RNeasy minikit and reverse transcribed into cDNA using a SuperScript Vilo cDNA synthesis kit (Life Technologies, Carlsbad, CA). qPCR was performed with PCR master mix using SYBR green (Life Technologies, Carlsbad, CA). The primers and probes for Psma7 and Psmb4 and hypoxanthine-guanine phosphoribosyltransferase (Hprt) were described previously (44).

RNA sequencing.

Sequencing libraries were prepared from 2 μg of total RNA by using a SureSelect strand-specific RNA sample prep kit (Agilent) and quantified by the quantitative MiSeq (qMiSeq) method (45). Sequencing was performed on a HiSeq 2500 sequencing system (Illumina), generating 76-bp single-end reads. Mapping and expression analyses were performed with TopHat and Cufflinks. Pathway analysis was conducted using the KEGG database on the DAVID platform (46). RNA sequencing (RNA-Seq) data for T-N2G-expressing cells treated with or without 100 μM DEM were retrieved from our previous study (29).

ChIP-Seq analysis.

Chromatin preparation was performed as described above. Briefly, cross-linked cells were lysed, and the nuclear fractions were extracted and sonicated with a focused ultrasonicator (catalog no. S220; Covaris) to generate 100- to 300-bp DNA fragments. The immunoprecipitation reaction was performed with an anti-MafG antibody (47). Purified ChIP DNA was used as the input DNA for preparing the sequencing libraries by using a NEBNext Ultra II DNA library prep kit (New England BioLabs [NEB]). The resulting libraries were quantified by qMiSeq and sequenced on a HiSeq 2500 sequencing system (Illumina), generating 101-bp single-end reads. Mapping and peak calling analyses were performed with Bowtie2 and MACS2. ChIP-Seq data for XBP1 were obtained from the NCBI SRA database under accession no. SRR4064260 for control liver and SRR4064262 for liver with partial hepatectomy. Heat maps of the normalized read density were generated by deepTools (48). De novo motif analysis was performed using MEME-ChIP (49). A motif search was performed using FIMO (50). WebLogo was used to show the frequency of CsMBE sequences (51). The comparison of the ChIP peaks with publicly available ChIP data was performed with the Cistrome Toolkit (37). ChIP-Seq data for T-N2G-expressing cells treated with or without 100 μM DEM were retrieved from our previous study (29).

Data availability.

All sequencing data are present in the NCBI GEO Superseries under accession no. GSE188788.

ACKNOWLEDGMENTS

We thank members of the Department of Integrative Genomics and the Department of Medical Biochemistry for their support and helpful discussions.

This work was supported in part by MEXT/JSPS KAKENHI (grant no. 19H05649 to M.Y. and grant no. 20K07335 to F.K.).

F.K. and M.Y. designed the research. F.K., A.O., N.H., and H.O. conducted the experiments and analyzed the data. F.K. and M.Y. wrote the paper.

We declare no conflicts of interest.

Footnotes

Supplemental material is available online only.

Supplemental file 1: Table S1 (mcb.00520-21-s0001.xlsx).

Contributor Information

Fumiki Katsuoka, Email: kfumiki@med.tohoku.ac.jp.

Masayuki Yamamoto, Email: masiyamamoto@med.tohoku.ac.jp.

REFERENCES

  • 1.Lambert SA, Jolma A, Campitelli LF, Das PK, Yin Y, Albu M, Chen X, Taipale J, Hughes TR, Weirauch MT. 2018. The human transcription factors. Cell 172:650–665. doi: 10.1016/j.cell.2018.01.029. [DOI] [PubMed] [Google Scholar]
  • 2.Newman JR, Keating AE. 2003. Comprehensive identification of human bZIP interactions with coiled-coil arrays. Science 300:2097–2101. doi: 10.1126/science.1084648. [DOI] [PubMed] [Google Scholar]
  • 3.Suzuki T, Yamamoto M. 2015. Molecular basis of the Keap1-Nrf2 system. Free Radic Biol Med 88:93–100. doi: 10.1016/j.freeradbiomed.2015.06.006. [DOI] [PubMed] [Google Scholar]
  • 4.Katsuoka F, Yamamoto M. 2016. Small Maf proteins (MafF, MafG, MafK): history, structure and function. Gene 586:197–205. doi: 10.1016/j.gene.2016.03.058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Yamamoto M, Kensler TW, Motohashi H. 2018. The KEAP1-NRF2 system: a thiol-based sensor-effector apparatus for maintaining redox homeostasis. Physiol Rev 98:1169–1203. doi: 10.1152/physrev.00023.2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Itoh K, Chiba T, Takahashi S, Ishii T, Igarashi K, Katoh Y, Oyake T, Hayashi N, Satoh K, Hatayama I, Yamamoto M, Nabeshima Y. 1997. An Nrf2/small Maf heterodimer mediates the induction of phase II detoxifying enzyme genes through antioxidant response elements. Biochem Biophys Res Commun 236:313–322. doi: 10.1006/bbrc.1997.6943. [DOI] [PubMed] [Google Scholar]
  • 7.Kobayashi A, Kang MI, Okawa H, Ohtsuji M, Zenke Y, Chiba T, Igarashi K, Yamamoto M. 2004. Oxidative stress sensor Keap1 functions as an adaptor for Cul3-based E3 ligase to regulate proteasomal degradation of Nrf2. Mol Cell Biol 24:7130–7139. doi: 10.1128/MCB.24.16.7130-7139.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Zhang DD, Lo SC, Sun Z, Habib GM, Lieberman MW, Hannink M. 2005. Ubiquitination of Keap1, a BTB-Kelch substrate adaptor protein for Cul3, targets Keap1 for degradation by a proteasome-independent pathway. J Biol Chem 280:30091–30099. doi: 10.1074/jbc.M501279200. [DOI] [PubMed] [Google Scholar]
  • 9.Wakabayashi N, Itoh K, Wakabayashi J, Motohashi H, Noda S, Takahashi S, Imakado S, Kotsuji T, Otsuka F, Roop DR, Harada T, Engel JD, Yamamoto M. 2003. Keap1-null mutation leads to postnatal lethality due to constitutive Nrf2 activation. Nat Genet 35:238–245. doi: 10.1038/ng1248. [DOI] [PubMed] [Google Scholar]
  • 10.Suzuki T, Muramatsu A, Saito R, Iso T, Shibata T, Kuwata K, Kawaguchi S-I, Iwawaki T, Adachi S, Suda H, Morita M, Uchida K, Baird L, Yamamoto M. 2019. Molecular mechanism of cellular oxidative stress sensing by Keap1. Cell Rep 28:746–758.e4. doi: 10.1016/j.celrep.2019.06.047. [DOI] [PubMed] [Google Scholar]
  • 11.Dinkova-Kostova AT, Kostov RV, Canning P. 2017. Keap1, the cysteine-based mammalian intracellular sensor for electrophiles and oxidants. Arch Biochem Biophys 617:84–93. doi: 10.1016/j.abb.2016.08.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Radhakrishnan SK, Lee CS, Young P, Beskow A, Chan JY, Deshaies RJ. 2010. Transcription factor Nrf1 mediates the proteasome recovery pathway after proteasome inhibition in mammalian cells. Mol Cell 38:17–28. doi: 10.1016/j.molcel.2010.02.029. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Waku T, Katayama H, Hiraoka M, Hatanaka A, Nakamura N, Tanaka Y, Tamura N, Watanabe A, Kobayashi A. 2020. NFE2L1 and NFE2L3 complementarily maintain basal proteasome activity in cancer cells through CPEB3-mediated translational repression. Mol Cell Biol 40:e00010-20. doi: 10.1128/MCB.00010-20. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Zhao R, Hou Y, Xue P, Woods CG, Fu J, Feng B, Guan D, Sun G, Chan JY, Waalkes MP, Andersen ME, Pi J. 2011. Long isoforms of NRF1 contribute to arsenic-induced antioxidant response in human keratinocytes. Environ Health Perspect 119:56–62. doi: 10.1289/ehp.1002304. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Kwak MK, Wakabayashi N, Greenlaw JL, Yamamoto M, Kensler TW. 2003. Antioxidants enhance mammalian proteasome expression through the Keap1-Nrf2 signaling pathway. Mol Cell Biol 23:8786–8794. doi: 10.1128/MCB.23.23.8786-8794.2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Mignotte V, Wall L, deBoer E, Grosveld F, Romeo PH. 1989. Two tissue-specific factors bind the erythroid promoter of the human porphobilinogen deaminase gene. Nucleic Acids Res 17:37–54. doi: 10.1093/nar/17.1.37. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Andrews NC, Erdjument-Bromage H, Davidson MB, Tempst P, Orkin SH. 1993. Erythroid transcription factor NF-E2 is a haematopoietic-specific basic-leucine zipper protein. Nature 362:722–728. doi: 10.1038/362722a0. [DOI] [PubMed] [Google Scholar]
  • 18.Friling RS, Bensimon A, Tichauer Y, Daniel V. 1990. Xenobiotic-inducible expression of murine glutathione S-transferase Ya subunit gene is controlled by an electrophile-responsive element. Proc Natl Acad Sci USA 87:6258–6262. doi: 10.1073/pnas.87.16.6258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Rushmore TH, Morton MR, Pickett CB. 1991. The antioxidant responsive element. Activation by oxidative stress and identification of the DNA consensus sequence required for functional activity. J Biol Chem 266:11632–11639. doi: 10.1016/S0021-9258(18)99004-6. [DOI] [PubMed] [Google Scholar]
  • 20.Otsuki A, Suzuki M, Katsuoka F, Tsuchida K, Suda H, Morita M, Shimizu R, Yamamoto M. 2016. Unique cistrome defined as CsMBE is strictly required for Nrf2-sMaf heterodimer function in cytoprotection. Free Radic Biol Med 91:45–57. doi: 10.1016/j.freeradbiomed.2015.12.005. [DOI] [PubMed] [Google Scholar]
  • 21.Otsuki A, Yamamoto M. 2020. cis-element architecture of Nrf2-sMaf heterodimer binding sites and its relation to diseases. Arch Pharm Res 43:275–285. doi: 10.1007/s12272-019-01193-2. [DOI] [PubMed] [Google Scholar]
  • 22.Kimura M, Yamamoto T, Zhang J, Itoh K, Kyo M, Kamiya T, Aburatani H, Katsuoka F, Kurokawa H, Tanaka T, Motohashi H, Yamamoto M. 2007. Molecular basis distinguishing the DNA binding profile of Nrf2-Maf heterodimer from that of Maf homodimer. J Biol Chem 282:33681–33690. doi: 10.1074/jbc.M706863200.
  • 23.Katsuoka F, Motohashi H, Ishii T, Aburatani H, Engel JD, Yamamoto M. 2005. Genetic evidence that small maf proteins are essential for the activation of antioxidant response element-dependent genes. Mol Cell Biol 25:8044–8051. doi: 10.1128/MCB.25.18.8044-8051.2005.
  • 24.Katsuoka F, Motohashi H, Tamagawa Y, Kure S, Igarashi K, Engel JD, Yamamoto M. 2003. Small Maf compound mutants display central nervous system neuronal degeneration, aberrant transcription, and Bach protein mislocalization coincident with myoclonus and abnormal startle response. Mol Cell Biol 23:1163–1174. doi: 10.1128/MCB.23.4.1163-1174.2003.
  • 25.Yamazaki H, Katsuoka F, Motohashi H, Engel JD, Yamamoto M. 2012. Embryonic lethality and fetal liver apoptosis in mice lacking all three small Maf proteins. Mol Cell Biol 32:808–816. doi: 10.1128/MCB.06543-11.
  • 26.Xu Z, Chen L, Leung L, Yen TS, Lee C, Chan JY. 2005. Liver-specific inactivation of the Nrf1 gene in adult mouse leads to nonalcoholic steatohepatitis and hepatic neoplasia. Proc Natl Acad Sci USA 102:4120–4125. doi: 10.1073/pnas.0500660102.
  • 27.Hirotsu Y, Hataya N, Katsuoka F, Yamamoto M. 2012. NF-E2-related factor 1 (Nrf1) serves as a novel regulator of hepatic lipid metabolism through regulation of the Lipin1 and PGC-1beta genes. Mol Cell Biol 32:2760–2770. doi: 10.1128/MCB.06706-11.
  • 28.Katsuoka F, Yamazaki H, Yamamoto M. 2016. Small Maf deficiency recapitulates the liver phenotypes of Nrf1- and Nrf2-deficient mice. Genes Cells 21:1309–1319. doi: 10.1111/gtc.12445.
  • 29.Katsuoka F, Otsuki A, Takahashi M, Ito S, Yamamoto M. 2019. Direct and specific functional evaluation of the Nrf2 and MafG heterodimer by introducing a tethered dimer into small Maf-deficient cells. Mol Cell Biol 39:e00273-19. doi: 10.1128/MCB.00273-19.
  • 30.Li X, Chen D, Yin S, Meng Y, Yang H, Landis-Piwowar KR, Li Y, Sarkar FH, Reddy GP, Dou QP, Sheng S. 2007. Maspin augments proteasome inhibitor-induced apoptosis in prostate cancer cells. J Cell Physiol 212:298–306. doi: 10.1002/jcp.21102.
  • 31.Bruening W, Giasson B, Mushynski W, Durham HD. 1998. Activation of stress-activated MAP protein kinases up-regulates expression of transgenes driven by the cytomegalovirus immediate/early promoter. Nucleic Acids Res 26:486–489. doi: 10.1093/nar/26.2.486.
  • 32.Meriin AB, Gabai VL, Yaglom J, Shifrin VI, Sherman MY. 1998. Proteasome inhibitors activate stress kinases and induce Hsp72. Diverse effects on apoptosis. J Biol Chem 273:6373–6379. doi: 10.1074/jbc.273.11.6373.
  • 33.Baird L, Tsujita T, Kobayashi EH, Funayama R, Nagashima T, Nakayama K, Yamamoto M. 2017. A homeostatic shift facilitates endoplasmic reticulum proteostasis through transcriptional integration of proteostatic stress response pathways. Mol Cell Biol 37:e00439-16. doi: 10.1128/MCB.00439-16.
  • 34.Shalgi R, Hurt JA, Lindquist S, Burge CB. 2014. Widespread inhibition of posttranscriptional splicing shapes the cellular transcriptome following heat shock. Cell Rep 7:1362–1370. doi: 10.1016/j.celrep.2014.04.044.
  • 35.Meyer HH, Wang Y, Warren G. 2002. Direct binding of ubiquitin conjugates by the mammalian p97 adaptor complexes, p47 and Ufd1-Npl4. EMBO J 21:5645–5652. doi: 10.1093/emboj/cdf579.
  • 36.Banerji J, Rusconi S, Schaffner W. 1981. Expression of a beta-globin gene is enhanced by remote SV40 DNA sequences. Cell 27:299–308. doi: 10.1016/0092-8674(81)90413-x.
  • 37.Zheng R, Wan C, Mei S, Qin Q, Wu Q, Sun H, Chen CH, Brown M, Zhang X, Meyer CA, Liu XS. 2019. Cistrome data browser: expanded datasets and new tools for gene regulatory analysis. Nucleic Acids Res 47:D729–D735. doi: 10.1093/nar/gky1094.
  • 38.Yoshida H, Nadanaka S, Sato R, Mori K. 2006. XBP1 is critical to protect cells from endoplasmic reticulum stress: evidence from site-2 protease-deficient Chinese hamster ovary cells. Cell Struct Funct 31:117–125. doi: 10.1247/csf.06016.
  • 39.Argemí J, Kress TR, Chang HCY, Ferrero R, Bértolo C, Moreno H, González-Aparicio M, Uriarte I, Guembe L, Segura V, Hernández-Alcoceba R, Ávila MA, Amati B, Prieto J, Aragón T. 2017. X-box binding protein 1 regulates unfolded protein, acute-phase, and DNA damage responses during regeneration of mouse liver. Gastroenterology 152:1203–1216.e15. doi: 10.1053/j.gastro.2016.12.040.
  • 40.Hubley R, Finn RD, Clements J, Eddy SR, Jones TA, Bao W, Smit AF, Wheeler TJ. 2016. The Dfam database of repetitive DNA families. Nucleic Acids Res 44:D81–D89. doi: 10.1093/nar/gkv1272.
  • 41.Cui M, Atmanli A, Morales MG, Tan W, Chen K, Xiao X, Xu L, Liu N, Bassel-Duby R, Olson EN. 2021. Nrf1 promotes heart regeneration and repair by regulating proteostasis and redox balance. Nat Commun 12:5270. doi: 10.1038/s41467-021-25653-w.
  • 42.Kwong M, Kan YW, Chan JY. 1999. The CNC basic leucine zipper factor, Nrf1, is essential for cell survival in response to oxidative stress-inducing agents. Role for Nrf1 in gamma-gcs(l) and gss expression in mouse fibroblasts. J Biol Chem 274:37491–37498. doi: 10.1074/jbc.274.52.37491.
  • 43.Ishida N, Aoki Y, Katsuoka F, Nishijima I, Nobukuni T, Anzawa H, Bin L, Tsuda M, Kumada K, Kudo H, Terakawa T, Otsuki A, Kinoshita K, Yamashita R, Minegishi N, Yamamoto M. 2020. Landscape of electrophilic and inflammatory stress-mediated gene regulation in human lymphoblastoid cell lines. Free Radic Biol Med 161:71–83. doi: 10.1016/j.freeradbiomed.2020.09.023.
  • 44.Sekine H, Okazaki K, Kato K, Alam MM, Shima H, Katsuoka F, Tsujita T, Suzuki N, Kobayashi A, Igarashi K, Yamamoto M, Motohashi H. 2018. O-GlcNAcylation signal mediates proteasome inhibitor resistance in cancer cells by stabilizing NRF1. Mol Cell Biol 38:e00252-18. doi: 10.1128/MCB.00252-18.
  • 45.Katsuoka F, Yokozawa J, Tsuda K, Ito S, Pan X, Nagasaki M, Yasuda J, Yamamoto M. 2014. An efficient quantitation method of next-generation sequencing libraries by using MiSeq sequencer. Anal Biochem 466:27–29. doi: 10.1016/j.ab.2014.08.015.
  • 46.Huang DW, Sherman BT, Lempicki RA. 2009. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4:44–57. doi: 10.1038/nprot.2008.211.
  • 47.Hirotsu Y, Katsuoka F, Funayama R, Nagashima T, Nishida Y, Nakayama K, Engel JD, Yamamoto M. 2012. Nrf2-MafG heterodimers contribute globally to antioxidant and metabolic networks. Nucleic Acids Res 40:10228–10239. doi: 10.1093/nar/gks827.
  • 48.Ramírez F, Dündar F, Diehl S, Grüning BA, Manke T. 2014. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res 42:W187–W191. doi: 10.1093/nar/gku365.
  • 49.Ma W, Noble WS, Bailey TL. 2014. Motif-based analysis of large nucleotide data sets using MEME-ChIP. Nat Protoc 9:1428–1450. doi: 10.1038/nprot.2014.083.
  • 50.Grant CE, Bailey TL, Noble WS. 2011. FIMO: scanning for occurrences of a given motif. Bioinformatics 27:1017–1018. doi: 10.1093/bioinformatics/btr064.
  • 51.Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo: a sequence logo generator. Genome Res 14:1188–1190. doi: 10.1101/gr.849004.

Associated Data

Supplementary Materials

Supplemental file 1

Table S1. mcb.00520-21-s0001.xlsx (XLSX file, 0.06 MB)

Data Availability Statement

All sequencing data are present in the NCBI GEO Superseries under accession no. GSE188788.


Articles from Molecular and Cellular Biology are provided here courtesy of Taylor & Francis
