Molecular Endocrinology. 2012 Jun 28;26(10):1646–1650. doi: 10.1210/me.2012-1114

Minireview: Nuclear Receptor and Coregulator Proteomics—2012 and Beyond

Bert W O'Malley 1, Anna Malovannaya 1, Jun Qin 1
PMCID: PMC3458220  PMID: 22745194

Abstract

The focus of our decade-long National Institutes of Health-sponsored NURSA Proteomics Atlas was to catalog and understand the composition of the steady-state interactome for all nuclear receptor coregulator complexes in a human cell. In this Perspective, we present a summary of the proteomics of coregulator complexes with examples of how one might use the NURSA data for future exploitation. The application of this information to the identification of the coregulator proteins that contribute to the molecular basis of polygenic diseases is emphasized.


In the past decade, it has become clear that the gene-regulatory functions of all transcription factors depend upon the recruitment of a large number of coregulators, a class of proteins composed of coactivators and corepressors (1–6). Coactivators are used by nuclear receptors (NR) and other transcription factors for gene induction, and corepressors are used for gene silencing. From the perspective of activation of target genes, a NR first receives an activating ligand signal (e.g. a hormone) and locates at specific nuclear DNA sites, where it binds to the enhancer/promoter regions of a target gene; the receptor then recruits a series of coactivator complexes that carry out the biochemical subreactions required for transcription. This is a dynamic process that works in conjunction with epigenomic nucleosomal marks, chromatin remodeling, and cell signaling reactions (7).

Modular coactivator complexes commonly consist of approximately 10 protein partners, either stably or interchangeably associated with each other. Each of these coactivator molecules contributes a unique function to the parent complex. The functions often are enzymatic because gene expression requires multiple enzymatic reactions to modify histones, rearrange nucleosomes, initiate transcription, perform elongation and splicing of RNA, and finally, to degrade the active regulatory proteins in situ during gene transcription (8). In fact, current protein biochemistry suggests that virtually all signals are transduced within a pathway, not by binary interactions between two proteins, but by interactions among separate multimeric protein complexes. This conclusion makes it imperative for cell and molecular biologists to understand the members of such complexes and the reasons behind their inclusion in one functional unit.

The focus of our decade-long National Institutes of Health-sponsored NURSA Proteomics Atlas was to catalog and understand the composition of the steady-state interactome for all NR coregulator complexes in a human cell. A wealth of data on coregulator complexes now has been accumulated (9, 10). In this project, immunoprecipitations of known coregulators, NR, signaling molecules, and regulators of DNA damage used approximately 2000 antibodies to produce over 3500 experiments for more than 2800 annotated human regulatory protein complexes. This large-scale effort led to the identification of about 11,500 unique human gene products, representing approximately 50% of our genome (10). The latter calculation was perhaps one of the most surprising outcomes of the project, because it suggests that, in contrast to original estimates, up to 50% of our genome is used to code for proteins that directly or indirectly control the expression of the genome itself. Our results produced a database that reinforced the impressive role that protein complexes play in the mammalian cell and emphasized that cell regulation is carried out by groups of proteins acting in concert as functional units. We defined the initial building blocks for protein coregulator complexes (minimal endogenous modules, or MEMO), the stable functional units (unique core complexes, or uniCORE), and the complex-complex interactions (CCI) that occur for more advanced pathway communication (Fig. 1). We also divided the CCI into two broad classes, termed Type I and II. In Type I networks, the main core complex modules remain fairly constant when used in various pathways, which often manifests as a dense preferential network of multisubunit cores; examples are the Nucleosome Remodeling and histone Deacetylation complex and the Mediator complex of coregulators. In contrast, some platform coactivators such as steroid receptor coactivator (SRC)-3 form so many transient connections for their diverse functions that they cannot be assigned a standard set of interactive co-coactivators; these coregulator complexes were designated as belonging to the Type II class (Fig. 1).

Fig. 1.

Protein-protein interactions constitute the molecular backbone of cell biology, where select proteins assemble into meta-stable complexes to form bioactive units. The intrinsic tiered organization of the interactome can be represented in three discrete layers. These are 1) the obligatory MEMO complex modules; 2) the uniCORE isoforms; and 3) the transient CCI networks that likely represent the backbone of regulatory biology. Generally, interaction networks of transcriptional coregulators display two patterns: Type I CCI have stable multisubunit cores and highly preferential network partners, whereas Type II coregulators form multitudinous protein complex associations, as exemplified by the CCI network of estrogen receptor coactivator SRC-3 (NCOA3). IP, Immunoprecipitation; Pol II, polymerase II.

A prime question now relates to how individual investigators can utilize this wealth of information to form new hypothesis-generating research and to understand the roles of individual components of these coregulator complexes in normal physiology. A way of approaching these data now exists (see www.NURSA.org), but what is needed from the scientific community are more proof-of-principle studies to demonstrate the ways in which users of this database can orient their questions toward an even greater understanding of physiology and disease. Logic tells us that if a coactivator complex is formed to achieve some biological process in a cell, then each protein within the complex makes a fractional contribution to the function of the complex as a whole. Otherwise, why would the protein components have coevolved as members of the complex?

SRC-2 and Retinoid-Related Orphan Receptor (ROR)α and Glucose Metabolism

As a first example, let us consider the release of glucose from the liver, a critical event for metabolic homeostasis. Release of preformed hepatic glucose requires an enzyme called “glucose-6-phosphatase” (G6Pase), which dephosphorylates glucose-6-phosphate, allowing it to pass out of the cell via the cell-secretory apparatus. When the G6Pase gene is genetically damaged, patients are faced with a life-threatening condition when deprived of food for short periods, a syndrome described as von Gierke's disease, Type 1a. Animals with a deleted G6Pase effector gene will mimic the human syndrome. Recently, however, we discovered another potential cause of this syndrome in mice: deletion of the SRC-2 coactivator gene. We found that SRC-2 coactivates RORα, which, in turn, activates the G6Pase gene (11). Consequently, we now know there are three potential ways to produce a von Gierke's phenotype: 1) a defect in the G6Pase gene itself; 2) a defect in the RORα transcription factor; and 3) a defect in the SRC-2 coactivator. In fact, because the SRC-2 complex is made up of multiple coactivator proteins, defects in any one of these regulatory proteins, or in a combination of them, could produce an abnormality in glucose release. This thought process leads us away from a strict consideration of only the integrity of the G6Pase gene and makes us consider the interplay of the multiple other genes that code for the coactivators and transcriptional proteins that play a role in the gene's expression.

A typical future scenario for exploiting the NURSA proteomic database to identify cooperating partners in a coregulator complex and to relate the complex to downstream functional metabolomics might be as follows: 1) a scientist is interested in a particular metabolic gene in liver (or an oncogene) and finds that the expression of the gene is altered in patients (or mice) producing a pathological metabolic abnormality; 2) he/she enters the NURSA proteomic database and sees that the human gene product is found in a multiprotein complex that was identified in our large-scale proteomics profiling; 3) because the primary data in the database are embedded in generic data, the user immediately sees that two of the other protein components in the parent complex also have been reported to be altered in a similar metabolic disease; 4) the user now can use the Transcriptomine (a NURSA web transcriptomic/microarray track) tool kit to find out if any of those three genes present in the protein complex have altered expression levels in pathophysiological liver samples; and 5) finally, the investigator can move to a metabolomics database, which will be a future part of NURSA, to search for alterations in intermediary metabolism or metabolites that may have been associated with one or more of the parent genes producing any of the mutant proteins in the coactivator complex. Adjunct databases such as Oncomine may be useful for certain cancer-related applications. Taken together, the range of data in our databases would provide an integrated picture of the transcriptional pathway and indicate a final metabolic pathway outcome.
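Steps 2 through 4 of the scenario above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NURSA interface: the gene names, the in-memory tables, the expression threshold, and the function `coregulator_candidates` are all hypothetical stand-ins for the proteomic, disease-annotation, and Transcriptomine look-ups.

```python
# Toy stand-ins for the resources in the scenario; all names and values
# are hypothetical placeholders, not NURSA data.
COMPLEXES = {
    "complex_1": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "complex_2": {"GENE_E", "GENE_F"},
}
DISEASE_LINKED = {"GENE_B", "GENE_C", "GENE_F"}  # genes with prior disease reports
# Expression change in diseased vs. normal liver (log2 fold change).
LOG2_FOLD_CHANGE = {"GENE_A": -1.8, "GENE_B": -1.2, "GENE_C": 0.1}


def coregulator_candidates(query_gene, min_abs_log2fc=1.0):
    """Find the query's complexes, flag partners with prior disease
    reports, and note which of those genes show altered expression."""
    results = []
    for name, members in sorted(COMPLEXES.items()):
        if query_gene not in members:
            continue
        partners = sorted(members - {query_gene})
        linked = [g for g in partners if g in DISEASE_LINKED]
        altered = [
            g for g in [query_gene] + linked
            if abs(LOG2_FOLD_CHANGE.get(g, 0.0)) >= min_abs_log2fc
        ]
        results.append({
            "complex": name,
            "partners": partners,
            "disease_linked": linked,
            "altered": altered,
        })
    return results


report = coregulator_candidates("GENE_A")
print(report)
```

In this toy case the query gene sits in one complex, two of its partners carry prior disease reports, and only one of those two also shows altered expression; that shortlist is what the investigator would carry forward to the metabolomics step.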

To further test the function of the complex, the investigator can utilize traditional methods to knock down individual components of the complex and monitor the resultant effects on hepatic metabolism. Mouse models of disease would be especially relevant for early validation studies. Using this scenario, the investigator then will have garnered sufficient information to infer that because of the polygenic input from the multiple genes coding for the proteins in the coactivator complex, there likely is a polygenic basis to this metabolic disease. Thus, the user of the database will have identified the complete potential of the polygenic input by knowing the components in the functional coactivator complex.

Such an integrated genomics-proteomics platform brings together data collected by different OMICS technologies, so that one can accurately assign functions to previously unknown genes that produce components of coactivator networks; this will allow further assessment of their resultant associated RNA transcripts and predict the metabolic outcomes from disturbances in all of the parent genes for the proteins in the complexes. Informatics, of course, would play an obligate role in this scenario. NURSA has an informatics team that is assembling this information to provide a means for integrating data from various “omics” platforms. Two published examples of this process in action are described below.

How does modern information concerning the components of a coactivator complex alter our appreciation of the contributing roles of coactivator defects to polygenic diseases? There is a wealth of data demonstrating that a given coactivator complex regulates not just NR but also a number of other transcription factors and their subsets of target genes, and that these genes are part of a grouping that regulates the complete requisite transcriptional output to produce some physiological goal such as metabolism, reproduction, or growth. If we again consider that functional coactivator complexes are made up of about 10 gene products, then we must conclude that all of these genes play a role in the function of the coactivator complex in question. If we accept the concept that coactivators are designed to be master genes that regulate larger physiological goals in cells, we realize that members of the coactivator complex have the ability to influence the intended physiological goals in a polygenic manner. Therefore, changes in concentrations of individual (or multiple) coactivator proteins (or allelisms) can influence the output function of the complex as a whole and influence all of the downstream target genes that work to complete the physiological process. Deficiencies in this process are especially likely to surface when the organism is put under stress of some type.

Members of a Sin3B Tumor Repressor Complex

In another example from a recent publication from our laboratories (10), the transcriptional repressor protein Sin3B was immunoprecipitated, and its associated proteins were determined by mass spectrometry. Sin3B is a recognized tumor suppressor protein. Interestingly, two other proteins implicated in breast cancers (EMSY and KDM5A) also were detected in the Sin3B complex (10). A simple deduction follows from our observation. If the downstream function of the Sin3B complex is repression of oncogenesis, then each protein in the complex is highly likely to contribute to some fraction of the oncogenic repressive output of the complex as a whole. Therefore, we can hypothesize that the other identified proteins in the Sin3B complex and its immediate interaction network are likely tumor suppressors and are prime candidates for further biological testing. Moreover, if testing substantiates that conclusion, an investigator now will have identified, in one fell swoop, a group of proteins functioning together as a tumor repressor complex. We would also realize that these diverse and geographically disparate gene products work in the same pathways and on the same targets to effect gene repression in cancer transformation and/or metastasis.

It should be expected that the unity of function would be strongest at the level of MEMO (10). A MEMO represents the basic core module of proteins that form a functional transcriptional complex, minus the next level of stable complex isoforms (uniCORE) and the CCI proteins. The interaction heterogeneities in uniCORE and CCI are modifiers of MEMO functions and often represent other interacting complexes that bring signals to the MEMO. Focusing initially on the MEMO, thus, provides us more specificity and less complexity for our first approximation of the cooperating genes.
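One way to apply this prioritization is to rank a bait protein's interactors by the layer in which they were observed, so that MEMO partners are tested first. The sketch below assumes a hypothetical table that already labels each interactor as MEMO, uniCORE, or CCI; the protein names and the `rank_candidates` helper are illustrative only and do not come from the NURSA resource.

```python
# Lower rank = tighter association with the bait = higher testing priority.
LAYER_RANK = {"MEMO": 0, "uniCORE": 1, "CCI": 2}

# Hypothetical interactors of a bait protein, each tagged with its layer.
interactors = [
    ("PROT_4", "CCI"),
    ("PROT_1", "MEMO"),
    ("PROT_3", "uniCORE"),
    ("PROT_2", "MEMO"),
    ("PROT_5", "CCI"),
]
already_characterized = {"PROT_2"}  # function already known, nothing new to test


def rank_candidates(pairs, known, max_layer="uniCORE"):
    """Return uncharacterized interactors up to max_layer, closest layer first."""
    cutoff = LAYER_RANK[max_layer]
    kept = [
        (protein, layer) for protein, layer in pairs
        if protein not in known and LAYER_RANK[layer] <= cutoff
    ]
    # Sort by layer rank, then by name for a stable, readable order.
    return sorted(kept, key=lambda item: (LAYER_RANK[item[1]], item[0]))


ranked = rank_candidates(interactors, already_characterized)
print(ranked)
```

Raising `max_layer` to `"CCI"` widens the candidate list to the transient network partners, at the cost of the specificity that the MEMO-first approach is meant to preserve.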

This approach allows us to link genomics with proteomics. For example, if we can identify one marker gene by deep sequencing of the genomes of patients with a disease, then we can turn to proteomic mass spectrometry to find the associated proteins in its functional complex. In turn, when we look back into DNA-sequencing banks from tumors, we can determine whether the genes encoding proteins in the functional complex carry mutations or single nucleotide polymorphisms. This complementation of information via protein complex composition provides a way of immediately linking disparate genomic sequence changes that otherwise would have no apparent association until huge numbers of tumors are sequenced. An approach using genomic DNA sequencing alone to group potential functional polygenes could otherwise require sequencing the genomes of up to 100,000 different patient tumors to achieve accurate statistical linkage of complicated diseases.
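The gain from grouping sequence changes by complex can be illustrated with a toy calculation. In the sketch below, which uses invented tumor identifiers and gene names, each gene is mutated in only a minority of tumors, yet the complex taken as a unit is hit more often than any single member. The data layout and the `hit_fraction` function are assumptions made for illustration.

```python
# Hypothetical mutation calls: tumor id -> set of mutated genes.
tumor_mutations = {
    "T1": {"GENE_A"},
    "T2": {"GENE_B", "GENE_X"},
    "T3": {"GENE_C"},
    "T4": {"GENE_X"},
    "T5": {"GENE_Y"},
    "T6": {"GENE_A", "GENE_Y"},
}
# Hypothetical complex membership taken from a proteomic experiment.
complex_members = {"GENE_A", "GENE_B", "GENE_C"}


def hit_fraction(mutations, genes):
    """Fraction of tumors carrying a mutation in at least one of `genes`."""
    hits = sum(1 for mutated in mutations.values() if mutated & genes)
    return hits / len(mutations)


# Frequency of each member gene on its own, then of the complex as a unit.
per_gene = {g: hit_fraction(tumor_mutations, {g}) for g in sorted(complex_members)}
per_complex = hit_fraction(tumor_mutations, complex_members)
print(per_gene)
print(per_complex)
```

Here four of the six toy tumors carry a mutation in some member of the complex, whereas no single member is mutated in more than two. A real analysis would also need a background mutation model and correction for gene length and complex size, which this sketch omits.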

Metastasis-Associated Coregulator Protein 1 (MTA1) and DJ1 in Neurological Dysfunction

For a final example, let us consider another recent publication that emanated from information provided by the NURSA database (12). Despite numerous studies of the oncogenic MTA1, the extended physiological roles of the MTA1 coactivator beyond cancer were unknown. From the NURSA proteomic database, it was noted that MTA1 was associated in a coregulator complex with DJ1 (or PARK7). The DJ1/PARK7 gene has been associated previously with Parkinson's disease. This result led to consideration of a potential neurological function for MTA1. We next conducted collaborative experiments that demonstrated that MTA1 is a bona fide coactivator and stimulator of tyrosine hydroxylase (TH) transcription in neuronal cells (12). Furthermore, MTA1-null mice were noted to have a movement disorder, lower TH expression, and lower dopamine concentrations in the striatum and substantia nigra (12). We substantiated that MTA1 physically achieves these functions by directly interacting with DJ1 and, in turn, recruiting the DJ1-MTA1-RNA polymerase II complex to the bicoid-binding element in the DNA of the TH promoter. Experiments then were done to show that the MTA1/DJ1 coactivator complex requires the Pitx3 homeodomain transcription factor to recruit it to the enhancer element of the TH gene. These findings revealed a previously unrecognized role for MTA1 as an upstream coactivator of TH and further advanced the notion of polygenic regulation of a disease-causing gene through the coordinated interactions of at least these three regulatory proteins. The function of TH directly relates back to the polygenic input of the multiple driver genes coding for the coregulator proteins in the regulatory coactivator complex responsible for expression of the TH gene. In this case, these three proteins, and others in the coactivator complex, provide polygenic input to dopamine production.

In summary, the use of proteomic data indicates a new way of thinking about polygenic diseases. We often consider such pathologies to be due to a combination of defective endpoint structural genes or additive/synergizing pathways that must be identified by genomic sequencing. The results of sequencing frequently reveal a widely dispersed grouping of mutations in the genome that present with no apparent functional linkage. In contrast, with the newly available proteomic databases, we now can consider whether such genes work together to code for functional members of coactivator/corepressor complexes that are required to activate/repress disease gene sets. Furthermore, stratification of the protein interactome into distinct modules is critical for prioritization of follow-up targets. Without appreciation for interaction preference, or interaction network “topology” as we might call it, a guilt-by-association approach could easily be extended to many hundreds of interacting partners, rendering the interaction information ineffective. With references such as our CCI resource, the use of affinity-based proteomics to directly attack a mechanism of polygenic disease is now possible and should prove much more effective than conventional protein-protein interaction knowledge. Technological advancements in proteomics fit well with the genomic era, allowing direct linkage of gene sequences with their functional output mediators, the coregulator proteins. Further popularization of such an approach will depend on additional advancements in informatics and a continued effort to develop convenient web-based resources for mining the new proteomic data. Such informatics advances will be key to widespread investigator usage.

Acknowledgments

This work was supported by The National Institute of Diabetes and Digestive and Kidney Diseases, NIH (Grant 2U19 DK062434) and The National Institute of Child Health and Human Development, NIH (Grant HD 2R01 HD08188).

Disclosure Summary: The authors all certify that there is nothing to disclose.

Footnotes

Abbreviations: CCI, complex-complex interaction; G6Pase, glucose-6-phosphatase; MEMO, minimal endogenous module; MTA1, metastasis-associated coregulator protein 1; NR, nuclear receptor; ROR, retinoid-related orphan receptor; SRC, steroid receptor coactivator; TH, tyrosine hydroxylase; uniCORE, unique core complex.

References

1. McDonnell DP, Vegeto E, O'Malley BW. 1992. Identification of a negative regulatory function for steroid receptors. Proc Natl Acad Sci USA 89:10563–10567
2. Vegeto E, Allan GF, Schrader WT, Tsai MJ, McDonnell DP, O'Malley BW. 1992. The mechanism of RU486 antagonism is dependent on the conformation of the carboxy-terminal tail of the human progesterone receptor. Cell 69:703–713
3. Baniahmad A, Leng X, Burris TP, Tsai SY, Tsai MJ, O'Malley BW. 1995. The τ4 activation domain of the thyroid hormone receptor is required for release of a putative corepressor(s) necessary for transcriptional silencing. Mol Cell Biol 15:76–86
4. Oñate SA, Tsai SY, Tsai MJ, O'Malley BW. 1995. Sequence and characterization of a coactivator for the steroid hormone receptor superfamily. Science 270:1354–1357
5. Xu J, Qiu Y, DeMayo FJ, Tsai SY, Tsai MJ, O'Malley BW. 1998. Partial hormone resistance in mice with disruption of the steroid receptor coactivator-1 (SRC-1) gene. Science 279:1922–1925
6. Lonard DM, Lanz RB, O'Malley BW. 2007. Nuclear receptor coregulators and human disease. Endocr Rev 28:575–587
7. Mandrup S, Hager GL. 2012. Modulation of chromatin access during adipocyte differentiation. Nucleus 3:12–15
8. Wu RC, Feng Q, Lonard DM, O'Malley BW. 2007. SRC-3 coactivator functional lifetime is regulated by a phospho-dependent ubiquitin time clock. Cell 129:1125–1140
9. Malovannaya A, Li Y, Bulynko Y, Jung SY, Wang Y, Lanz RB, O'Malley BW, Qin J. 2010. Streamlined analysis schema for high-throughput identification of endogenous protein complexes. Proc Natl Acad Sci USA 107:2431–2436
10. Malovannaya A, Lanz RB, Jung SY, Bulynko Y, Le NT, Chan DW, Ding C, Shi Y, Yucer N, Krenciute G, Kim BJ, Li C, Chen R, Li W, Wang Y, O'Malley BW, Qin J. 2011. Analysis of the human coregulator complexome. Cell 145:787–799
11. Chopra AR, Louet JF, Saha P, An J, Demayo F, Xu J, York B, Karpen S, Finegold M, Moore D, Chan L, Newgard CB, O'Malley BW. 2008. Absence of the SRC-2 coactivator results in a glycogenopathy resembling Von Gierke's disease. Science 322:1395–1399
12. Reddy SD, Rayala SK, Ohshiro K, Pakala SB, Kobori N, Dash P, Yun S, Qin J, O'Malley BW, Kumar R. 2011. Multiple coregulatory control of tyrosine hydroxylase gene transcription. Proc Natl Acad Sci USA 108:4200–4205

Articles from Molecular Endocrinology are provided here courtesy of The Endocrine Society
