Abstract
TBP-associated factor 4 (TAF4), an essential subunit of the TFIID complex, acts as a coactivator for multiple transcriptional regulators, including Sp1 and CREB. However, little is known regarding the structural properties of the TAF4 subunit that lead to the coactivator function. Here, we report the crystal structure at 2.0-Å resolution of the human TAF4-TAFH domain, a conserved domain among all metazoan TAF4, TAF4b, and ETO family members. The hTAF4-TAFH structure adopts a completely helical fold with a large hydrophobic groove that forms a binding surface for TAF4 interacting factors. Using peptide phage display, we have characterized the binding preference of the hTAF4-TAFH domain for a hydrophobic motif, DΨΨζζΨΦ, that is present in a number of nuclear factors, including several important transcriptional regulators with roles in activating, repressing, and modulating posttranslational modifications. A comparison of the hTAF4-TAFH structure with the homologous ETO-TAFH domain reveals several critical residues important for hTAF4-TAFH target specificity and suggests that TAF4 has evolved in response to the increased transcriptional complexity of metazoans.
Keywords: TFIID, transcription, x-ray crystallography, TAFH domain
The general transcription factor TFIID, composed of the TATA-binding protein (TBP) and at least 14 additional TBP-associated factors (TAFs) (1), plays an important role in the regulation of gene transcription by RNA polymerase II. It contributes to a large number of activities necessary for regulated transcription, such as core promoter recognition and chromatin modification and recognition (2, 3). Individual TAFs are important for mediating distinct activator-specific transcriptional regulation in vivo (4). Studies of yeast strains having temperature-sensitive mutations indicate that ≈84% of yeast genes depend on one or more TAFs (5), indicating the importance of this factor in RNA Pol II transcription.
The human TAF4 (hTAFII130/dTAFII110) subunit of the TFIID complex was the first of the TAF subunits demonstrated to possess coactivator activity for the glutamine-rich activators Sp1 (6), CREB (7), and nuclear receptors RAR and TR (8). More recently, TAF4 has been demonstrated to play a critical role in maintaining the stability of the TFIID complex (9). The hTAF4 sequence contains four glutamine-rich tracts mediating interactions with the activators Sp1 and CREB (7) and two highly conserved domains CI and CII [supporting information (SI) Fig. 5A]. The misregulation of Sp1 activation mediated by hTAF4 has been implicated in Huntington's disease and a number of other neurodegenerative diseases that result from polyglutamine expansions (10). A region (residues 870–911) of the CII domain has been shown to interact with the histone-fold motif of hTAF12 and form a novel histone-like pair (11). The CI region of hTAF4 is highly conserved among all metazoan TAF4, TAF4b, and ETO family members and is also known as the TAF homology (TAFH) domain (6, 12) (SI Fig. 5B). The ETO-TAFH domain has been demonstrated to mediate the repression of E-box-regulated genes through its interaction with the activation domain AD1 of E proteins (13). However, to date no function has been attributed to the TAFH domain in hTAF4 (hTAF4-TAFH).
Here, we report the crystal structure and functional characterization of the hTAF4-TAFH domain. Our structure at 2.0-Å resolution reveals that the hTAF4-TAFH domain adopts a completely helical fold closely related to the recently determined structure of the ETO-TAFH domain (14) with a large hydrophobic groove that is used by TAF4 for protein–protein interactions. Our biochemical and structural analysis reveals that the hTAF4-TAFH domain has distinct but related sequence specificity from that of the ETO-TAFH domain because of the rearrangement of hydrophobic residues forming the binding surface. Results from a peptide phage display screen identified a previously undescribed hydrophobic binding motif, DΨΨζζΨΦ, for the hTAF4-TAFH domain. This signature sequence is present in a number of transcriptional regulators that may be potential targets of the hTAF4-TAFH domain in vivo.
Results
Three-Dimensional Structure of the hTAF4-TAFH Domain.
A structural domain corresponding to the CI region (also known as the TAFH domain) between residues 592 and 675 was identified by limited proteolysis and liquid chromatography MS analysis (data not shown). Subsequent subcloning identified a fragment corresponding to residues 575–688 that was highly soluble and adopted a completely α-helical structure (henceforth designated hTAF4-TAFH). A chemically modified version of the hTAF4-TAFH domain that contained dimethylated lysine residues was crystallized in space group P6₃22 with a single copy of the hTAF4-TAFH molecule per asymmetric unit. The structure was determined at 2.0-Å resolution by applying the multiwavelength anomalous dispersion (MAD) method. Clear density was present for all amino acids between residues 583 and 679 that included all residues contained within the TAFH domain. Residues 575–582 and 680–688 are not present in the electron density map. The current refined model consists of 97 residues, an SO₄²⁻ ion plus solvent, and has an R-factor of 21.4% and an Rfree value of 24.8%. Statistics of the crystallographic data, analysis, and refinement are summarized in SI Table 4.
The hTAF4-TAFH domain structure is made up of five α-helices (Fig. 1A) that together form a five-helix “wedge” in which a more commonly observed four-helix bundle (helices 1, 2, 3, and 5) is perturbed by the unusual orthogonal orientation of helix 4. A large open groove that is completely devoid of hydrogen bond acceptors or donors (Fig. 1B) is present between helices 1 and 4. This groove runs across the entire length of one face of the hTAF4-TAFH domain with a size of ≈23 Å × 9 Å (length × width, respectively) in a slightly curved path. The groove is large enough to accommodate a peptide with alpha-helical secondary structure and is formed by hydrophobic residues from helices 1, 2, 3, and 4 (Fig. 1C). Although helix 5 does not directly participate in the formation of the hydrophobic groove, it plays an important role in the overall packing of the hydrophobic core.
The hydrophobic cleft of the hTAF4-TAFH domain forms the putative interaction surface and can be thought of as being formed from three main components: pockets 1 and 2, and an aromatic region (Fig. 1 B and C). Residues L603 and L645 lie at the junction of pockets 1 and 2, yielding a continuous hydrophobic surface. The surface of the groove in the aromatic region becomes quite flat and is primarily formed by the exposure of two conserved phenylalanine residues (F599 and F656). Another key feature is the presence of a conserved lysine residue (K595) at the end of the aromatic region (Figs. 1C and 2B) that plays an important role in the recognition of ligands for the hTAF4-TAFH domain. Most of the residues forming the groove are highly conserved among the TAFH domains found in the TAF4, TAF4b, and ETO family members (highlighted in red in SI Fig. 5B).
Recently, the NMR structure of the ETO-TAFH domain was determined, and a similar hydrophobic groove was identified as the binding surface for ETO targets that contain LxxLL motifs (14). We have compared the three-dimensional structures of the two TAFH domains by superimposing ETO-TAFH (Protein Data Bank ID code 2H7B) onto our newly determined hTAF4-TAFH domain structure. The overlay, based on superposition of 22 identical residues, yields an rms deviation of ≈2.0 Å for 63 Cα atoms, showing that the TAFH structure is very well conserved (Fig. 1 D and E). Despite their relatively high sequence similarity, the features of the binding groove are significantly different between these two TAFH domains. The binding groove of ETO-TAFH is significantly shorter (≈17 Å in length, as opposed to 23 Å in hTAF4) and is much less flat than that of the hTAF4-TAFH domain (Fig. 1 B and E).
The most striking difference between the two TAFH domains is the presence of an additional short, fifth α-helix in hTAF4-TAFH that facilitates the changes to the binding cleft. The presence of this helix results in changes to the orientations of two key conserved phenylalanine residues (F599 and F656 in hTAF4) through rearrangements to the hydrophobic core (Fig. 2 A and B). Among all vertebrate TAF4 family members a proline residue (P669) is present, but this residue is not conserved in the ETO family member sequences (SI Fig. 5B). P669 breaks helix 4 of hTAF4-TAFH and leads to the formation of the TAF4-specific fifth α-helix (Fig. 2A). Where the fifth helix packs against the face of the hTAF4-TAFH domain, a vertebrate TAF4-family-specific phenylalanine residue (F674) is buried in the hydrophobic core and inserts between I604 and L664. The insertion of F674 pushes I604 and L664 apart and displaces the C-terminal portion of the fourth helix ≈3 Å away from the binding cleft (bottom of Fig. 2A). The increased distance between helices 1 and 4 provides space for the movement of another TAF4-family-specific residue, C596, away from the binding surface downward (Fig. 2B). The repacking of the hydrophobic core results in the large downward shifts (≈3.5 Å) of phenylalanine residues (F599 and F656) in the binding cleft, ultimately leading to the formation of a more extended and flat recognition surface in the aromatic region of the hTAF4-TAFH binding groove.
A second important difference between the hTAF4 and ETO-TAFH domains is a change to the connection between helices 1 and 2 that results from the deletion of two residues in the ETO-TAFH sequence. In addition, there are changes of identity in residues surrounding the binding groove. These changes cause the rearrangement of residues in the connecting sequence and shift the connection toward the inner portion of the ETO-TAFH binding groove. As a result, pocket 1 is shallower and pocket 2 is smaller in the ETO-TAFH domain than in the hTAF4-TAFH domain.
The differences observed in the aromatic region and pockets 1 and 2 are likely to be the main determinants of the differences in binding specificity between TAFH domains from the ETO and TAF4 families.
Specificity of the hTAF4-TAFH Domain.
The ETO-TAFH domain had been demonstrated to interact with an LxxLL motif in the first activation domain (AD1) of E proteins such as E47, E2A, or HEB (13). Based on the similarities between hTAF4-TAFH and ETO-TAFH domains, we first investigated whether the hTAF4-TAFH domain could also recognize the AD1 from the E-box proteins. GST pull-down assays were carried out, but the AD1 region could not be shown to interact stably with the hTAF4-TAFH domain by using this assay (data not shown). In parallel, we synthesized a 21-mer peptide corresponding to the AD1 region of E2A and measured its affinity for hTAF4-TAFH directly by isothermal titration calorimetry (ITC). This peptide was found to bind to hTAF4-TAFH weakly (Kd = 140.2 ± 4.0 μM) with 1:1 stoichiometry in a specific manner (Table 1 and SI Fig. 6A).
Table 1.
Sequence source | Peptide sequence | Kd, μM | n |
---|---|---|---|
E-box activation domain | |||
E2A (7–27) - AD1 [P15923] | MAPVGTDKELSDLLDFSMMFP | 140.2 ± 4.0 | 0.802 ± 0.120 |
Phage display sequences | |||
Phage Seq 7 | RWGGSDLLQTILLSAGLSGGR | 5.6 ± 0.2 | 0.945 ± 0.059 |
Phage Seq 7* | RWGGSDLLQTLLLSAGLSGGR | 15.3 ± 0.1 | 0.885 ± 0.072 |
Phage Seq 8 | RGGSSDILQTAWSVAFSGGR | 10.0 | 1.015 |
Phage Seq 1 | EWRLLHTLFPPPGGG | 319.2 ± 5.6 | 0.816 ± 0.113 |
Phage Seq 5 | RGGSNPHMLWALFPPASGGR | 606.6 ± 6.8 | N.D. |
Candidate sequences | |||
Zhangfei (ZF) (71–90) [Q9NS37] | YSAAEMQRFSDLLQRLLNGIGGSS | 5.2 ± 2.0 | 0.810 ± 0.127 |
LZIP (46–63) [AAB84166] | SDWEVDDLLCSLLSPPAS | 41.0 ± 5.3 | 1.003 ± 0.134 |
E2F4 (366–382) [Q16254] | MSSELLEELMSSEVYAP | 275.5 | 0.907 |
Huntingtin (1311–1328) [NP_002102] | TVCVQQLLKTLFGTNLAS | >300 | N.D. |
HIRA (764–781) [CAG30389] | YEGGRRLLSPILLPSPISTLHSGE | >300 | N.D. |
hSet1 (1283–1305) [O15047] | SDEAERPRPLLSHILLEHNYALA | >300 | N.D. |
PNUTS (81–100) [NP_002705] | YSKTTNNIPLLQQILLTLQHSSGE | >300 | N.D. |
HDAC9 (389–409) [AAK66821] | YNSSHQALLQHLLLKEQMRQQK | >300 | N.D. |
PGC1α (136–155) [NP_037393] | YQEAEEPSLLKKLLLAPANTQ | >360 | N.D. |
A spacer sequence G-G-G or S-G-G-R, indicated in italics, was added to the C terminus of phage peptides 1, 5, 7, 7*, and 8. Other residues shown in italics were added to increase the solubility of peptides and to allow determination of the peptide concentration by UV absorbance spectroscopy. N.D., not detectable.
Based on the crystal structure of the hTAF4-TAFH domain and the structural similarity to the ETO-TAFH domain, which had been shown to be specific for the LxxLL motif, we decided to use peptide phage display to attempt to isolate binding sequences that could be subsequently used as a guide to identify potential TAF4 binding partners. A peptide phage display library randomized at 12 positions was used to probe interactions with untagged hTAF4-TAFH. Given that the hTAF4-TAFH domain could interact weakly with AD1 peptides from E2A (Table 1), this peptide was used to elute specifically bound phage during panning. After the five rounds of panning, the eluted phage were enriched ≈230 times (from 1.39 × 10⁵ to 3.23 × 10⁷ pfu/μl). The peptide coding regions from 44 phage plaques were sequenced and the translated amino acid sequences were aligned (Table 2). Strikingly, a majority of the selected sequences contained a high proportion of hydrophobic residues in a pattern reminiscent of the ETO-TAFH-specific sequence (LxxLL). However, the hTAF4-TAFH domain displayed a preference for an expanded motif with four hydrophobic residues (ΦΦxxΦΦ, where Φ represents a hydrophobic residue and x represents any residue). Later, it became apparent that in addition to the hydrophobic motif, the hTAF4-TAFH domain had an additional requirement for aspartic acid at the 0 position.
Table 2.
Sequence family | Selected sequence | Frequency of occurrence |
---|---|---|
ΦΦxxΦΦ motif | ||
1 | EWRLLHTLFPPP | 18 |
2 | ERALLHTLFPPD | 6 |
3 | SPHLLQALFPPP | 5 |
4 | EDLLAKAWNIHF | 3 |
5 | NPHMLWALFPPA | 2 |
6 | DIVSRAWTIVMR | 2 |
7 | DLLQTILLSAGL | 1 |
8 | SDILQTAWSVAF | 1 |
9 | DLLAEAMQTARI | 1 |
10 | YEDPISWLHEMF | 1 |
11 | HPNLLRALFPSP | 1 |
No apparent motif | ||
12 | QISRWQPQQILV | 1 |
13 | NTVYAASWSRPQ | 1 |
14 | ISAKAQARTIIP | 1 |
To help characterize the relative affinities of the selected sequences, phage ELISA was carried out, and three sequences EWRLLHTLFPPP (phage 1), DLLQTILLSAGL (phage 7), and SDILQTAWSVAF (phage 8) were found to exhibit significantly higher affinities than other selected phage (data not shown). Synthetic peptides for several isolated sequences were generated for ITC measurements to characterize the best candidates (Table 1). Phage sequence 7 (DLLQTILLSAGL, Kd = 5.6 ± 0.2 μM) (SI Fig. 6B) and sequence 8 (SDILQTAWSVAF, Kd = 10.0 μM) were identified as the best binding sequences from the selected pool. We also tested a peptide identical to phage sequence 7 (DLLQTILLSAGL) with a single substitution of leucine for isoleucine, labeled in italic in the sequence (named as phage sequence 7*: DLLQTLLLSAGL), and measured its binding ability with ITC (Table 1). This peptide bound to the hTAF4-TAFH domain with a slightly reduced affinity of 15.3 ± 0.1 μM.
Candidate Interaction Partners for the hTAF4-TAFH Domain.
Based on the binding sequences of the hTAF4-TAFH domain that were selected with peptide phage display and the similarities between binding grooves of TAFH domains from ETO and hTAF4, we searched the Swiss-Prot/TrEMBL protein databases for sequences from human proteins containing either LLxxL(L/M/F) or LLxx(I/L)LL motifs using ScanProsite (15) and chose nine peptide sequences as potential targets of the hTAF4-TAFH domain (Table 1). These proteins either had previously been implicated in TAF4-dependent transcription, for example, huntingtin (10), or were functionally associated with transcription regulation, such as the sequence-specific transcriptional regulators Zhangfei (ZF), LZIP, and E2F4; the transcriptional coactivator PGC1α; the histone modifiers hSet1, PNUTS, and HDAC9; and the histone chaperone HIRA.
We synthesized peptides corresponding to hydrophobic motifs from these candidates and measured their binding affinities by ITC (Table 1). The peptide taken from ZF bound very tightly at a stoichiometry near 1:1 and with an affinity of 5.2 ± 2.0 μM, which was as high as that of the best phage display-isolated sequence 7 (Kd = 5.6 ± 0.2 μM). The binding affinity of a peptide from the transcriptional activator LZIP was found to be 41.0 ± 5.3 μM. Surprisingly, all of the other candidate peptides tested bound very weakly, with Kd values of 275.5 μM or higher.
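To put these dissociation constants on an energy scale, the following minimal sketch converts a Kd into a standard binding free energy through ΔG° = RT ln(Kd / 1 M). It assumes the 37.0°C titration temperature given in Materials and Methods and a 1 M standard state; the function name is illustrative, and this conversion is not part of the reported analysis.

```python
import math

R_J_PER_MOL_K = 8.314462618  # molar gas constant, J/(mol*K)
T_KELVIN = 310.15            # 37.0 degrees C, the stated ITC temperature

def binding_free_energy_kj(kd_micromolar: float, temperature_k: float = T_KELVIN) -> float:
    """Return the standard binding free energy in kJ/mol for a Kd given in micromolar.

    Assumes a 1 M standard state, so the Kd is used as a dimensionless
    number after conversion to molar units.
    """
    kd_molar = kd_micromolar * 1e-6
    return R_J_PER_MOL_K * temperature_k * math.log(kd_molar) / 1000.0

# Kd values taken from Table 1 (ZF, LZIP, and E2F4 peptides).
for name, kd in [("ZF", 5.2), ("LZIP", 41.0), ("E2F4", 275.5)]:
    print(f"{name}: {binding_free_energy_kj(kd):.1f} kJ/mol")
```

More negative values correspond to tighter binding, so the ZF peptide gives the most favorable free energy of the three.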
We had believed the hydrophobic motif to be the primary determinant driving interactions between the peptides and the hTAF4-TAFH domain. The lack of observed binding for many peptides that contained hydrophobic residues with appropriate spacing and character led us to reexamine all of the sequences that we had tested by ITC. All sequences that bound well had an aspartic acid residue at the 0 position relative to the hydrophobic motif. Interestingly, the LZIP peptide showed a 2-fold decrease in the binding affinity for a dimethylated-lysine variant of hTAF4-TAFH, suggesting a potential role of a lysine in target recognition. Next to the TAF4-TAFH hydrophobic groove, there is a highly conserved lysine (K595) (Fig. 1C). Interactions between the aspartic acid at the 0 position of the hTAF4-TAFH binding motif and K595 next to the binding cleft may enhance the ligand specificity and be required for interaction.
To confirm that the hydrophobic groove, composed of pockets 1 and 2 and the aromatic region (Fig. 1B), was the interaction surface of the hTAF4-TAFH domain, several conserved hydrophobic residues within the groove were mutated and the effects of the mutations on the binding affinity between hTAF4-TAFH and ZF or LZIP peptides were investigated with ITC (Table 3). Mutations of F599, L603, V620, or F656 into charged residues (R or E) all reduced binding affinities ≈2- to 5-fold compared with the native hTAF4-TAFH domain, showing that the hydrophobic groove was the specific binding surface of the hTAF4-TAFH domain.
Table 3.
Peptide | Protein | Kd, μM | n |
---|---|---|---|
Zhangfei (71–90) [Q9NS37] | hTAF4–TAFH | 5.2 ± 2.0 | 0.810 ± 0.127 |
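Zhangfei (71–90) [Q9NS37] | F599R mutant | 19.9 | 0.917 |
Zhangfei (71–90) [Q9NS37] | L603E mutant | 11.0 | 0.617 |
Zhangfei (71–90) [Q9NS37] | V620R mutant | 10.1 | 0.667 |
Zhangfei (71–90) [Q9NS37] | F656R mutant | 13.1 | 0.713 |
LZIP (46–63) [AAB84166] | hTAF4–TAFH | 41.0 ± 5.3 | 1.003 ± 0.134 |
LZIP (46–63) [AAB84166] | F656R mutant | 130.8 | 1.148 |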
The sequences of ZF and LZIP peptides are listed in Table 1.
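The fold reductions in affinity can be read directly from Table 3 as the ratio of mutant to wild-type Kd. The short sketch below does that arithmetic; the dictionary layout and function name are illustrative choices, and only the Kd values come from the table.

```python
# Wild-type Kd values (in micromolar) for each peptide, from Table 3.
WILD_TYPE_KD = {"ZF": 5.2, "LZIP": 41.0}

# Mutant Kd values (in micromolar), keyed by (peptide, mutation), from Table 3.
MUTANT_KD = {
    ("ZF", "F599R"): 19.9,
    ("ZF", "L603E"): 11.0,
    ("ZF", "V620R"): 10.1,
    ("ZF", "F656R"): 13.1,
    ("LZIP", "F656R"): 130.8,
}

def fold_reduction(peptide: str, mutation: str) -> float:
    """Ratio of mutant Kd to wild-type Kd; values above 1 mean weaker binding."""
    return MUTANT_KD[(peptide, mutation)] / WILD_TYPE_KD[peptide]

for (peptide, mutation) in MUTANT_KD:
    print(f"{peptide} + {mutation}: {fold_reduction(peptide, mutation):.1f}-fold weaker")
```

Computed this way, the ratios range from just under 2-fold (V620R with ZF) to just under 4-fold (F599R with ZF), which sits within the ≈2- to 5-fold range quoted in the text.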
Combining the information obtained from the peptide phage display and ITC measurements, we have identified the binding preference of the hTAF4-TAFH domain for short peptide sequences containing the motif DΨΨζζΨΦ (sites 0–6), where Ψ represents V, I, L, or M; ζ represents hydrophilic residues including N, Q, S, or T; and Φ represents V, I, L, F, W, Y, or M (16). When plotted on a helical wheel, this motif forms an amphipathic helix. Such a helix is of the appropriate size to fit within the hTAF4-TAFH binding groove. Compared with the LxxLL motif recognized by the ETO-TAFH domain, the DΨΨζζΨΦ motif is longer and more hydrophobic, consistent with the observed differences in the size and shape of the hTAF4-TAFH binding cleft from those of the ETO-TAFH domain.
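The helical-wheel argument can be illustrated with a small sketch that places each motif position at 100° increments, the rotation per residue of an ideal α-helix with 3.6 residues per turn. The ideal-helix geometry and the choice of the phage sequence 7 core (DLLQTIL) as the example are assumptions made for illustration only.

```python
DEGREES_PER_RESIDUE = 100  # ideal alpha-helix: 360 degrees / 3.6 residues per turn

# Hydrophobic residues as defined for the Psi and Phi classes in the text.
HYDROPHOBIC = set("VILMFWY")

def wheel_angles(motif: str) -> list[int]:
    """Angular position (degrees) of each residue on a helical wheel."""
    return [(i * DEGREES_PER_RESIDUE) % 360 for i in range(len(motif))]

def hydrophobic_angles(motif: str) -> list[int]:
    """Sorted wheel angles occupied by hydrophobic residues."""
    return sorted(a for a, res in zip(wheel_angles(motif), motif) if res in HYDROPHOBIC)

motif = "DLLQTIL"  # core of phage sequence 7, positions 0-6
for position, (residue, angle) in enumerate(zip(motif, wheel_angles(motif))):
    kind = "hydrophobic" if residue in HYDROPHOBIC else "polar/charged"
    print(f"site {position}: {residue} at {angle:3d} degrees ({kind})")
```

For this sequence the four hydrophobic positions (sites 1, 2, 5, and 6) fall within a 140° arc between 100° and 240°, whereas D and the two hydrophilic residues occupy the remainder of the wheel, which is the pattern expected for an amphipathic helix.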
We again searched the Swiss-Prot/TrEMBL protein database for sequences matching the DΨΨζζΨΦ motif (including the requirement for D at position 0) to identify potential binding targets of the hTAF4-TAFH domain. The search yielded 556 human genes, but a majority of these were membrane-associated or secreted factors and thus unlikely to be relevant to hTAF4-TAFH. A number of proteins related to transcription regulation were identified as potential targets. These include activators, repressors, mediator, chromatin modifiers, and kinases (SI Table 5). Interactions with these factors would be consistent with the function of human TAF4 as a coactivator.
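A motif search of this kind can be sketched with a regular expression. The example below is not the ScanProsite query used in the study; it is a minimal Python illustration in which the residue classes follow the definitions in the text, and the broader hydrophilic set (`ZETA_BROAD`) is an assumption introduced here because the text lists N, Q, S, and T only as examples of ζ.

```python
import re

PSI = "VILM"              # Psi class as defined in the text
PHI = "VILFWYM"           # Phi class as defined in the text
ZETA_STRICT = "NQST"      # residues explicitly listed for zeta
ZETA_BROAD = "NQSTRKDEH"  # assumed wider hydrophilic set, for illustration only

def find_motifs(sequence: str, zeta: str = ZETA_STRICT) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every D-Psi-Psi-zeta-zeta-Psi-Phi hit.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile(f"(?=(D[{PSI}]{{2}}[{zeta}]{{2}}[{PSI}][{PHI}]))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]

# Peptide sequences copied from Table 1.
phage7 = "RWGGSDLLQTILLSAGLSGGR"
zhangfei = "YSAAEMQRFSDLLQRLLNGIGGSS"

print(find_motifs(phage7))                     # strict zeta definition
print(find_motifs(zhangfei))                   # strict zeta definition
print(find_motifs(zhangfei, zeta=ZETA_BROAD))  # assumed broader zeta set
```

Under the strict reading, the phage sequence 7 peptide matches at DLLQTIL, whereas the ZF peptide (DLLQRLL) matches only when arginine is admitted as a hydrophilic residue, so the width of the ζ class is a parameter worth stating explicitly in any database scan.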
Interactions between the hTAF4-TAFH domain and two potential targets listed in SI Table 5 (transcriptional activators ZF and LZIP) were tested by using in vitro GST pull-down assays. The ZF protein interacts with GST-hTAF4-TAFH but not with GST alone. In addition, a mutant version of the ZF protein (designated as polyAla ZF mutant), in which the TAF4-TAFH binding motif was replaced by seven alanine residues (DLLQRLL → AAAAAAA), could not be pulled down by GST-hTAF4-TAFH (Fig. 3). A longer version of TAF4, incorporating the TAFH domain (387–835), also was capable of interacting with full-length ZF (SI Fig. 7). A GST pull-down assay with in vitro-translated Gal4-LZIP also showed similar interactions between LZIP and GST-hTAF4-TAFH (data not shown). These results suggest that hTAF4-TAFH/Zhangfei or LZIP interactions are mediated via the hydrophobic motif identified in our phage display selection and that these motifs are exposed within the native ZF or LZIP proteins.
Docking of the ZF Peptide to hTAF4-TAFH.
To model the recognition of the DΨΨζζΨΦ motif by the hTAF4-TAFH domain, a peptide corresponding to residues 77–86 of ZF (sequence FSDLLQRLLN) was docked to hTAF4-TAFH by using Insight II (Accelrys, San Diego, CA) (17). In our model, the ZF peptide adopts a distorted helical conformation that exhibits good complementarity to the TAF4-TAFH structure. The ZF peptide is aligned with its N terminus toward the aromatic region and the C terminus toward pockets 1 and 2 (Fig. 4). L80 and L81 make hydrophobic interactions with F599 and L653 in the aromatic region of the binding groove. The side chain of D79 is oriented toward the conserved K595 of hTAF4-TAFH (Fig. 1C), making an electrostatic interaction. Toward the C terminus, L84 and L85 sit in pockets 2 and 1, respectively. Notably, pocket 1 is large enough to accommodate hydrophobic residues with aromatic rings such as F, W, or Y, consistent with the results of our phage display selection and the derived hydrophobic binding motif, DΨΨζζΨΦ.
Discussion
Even though the high degree of conservation of the hTAF4-TAFH domain suggests an important biological role and the deletion mutations removing the central portion of TAF4 show a significant reduction in transcriptional activation by Sp1 (N.T., unpublished data), the function of the TAFH domain in the context of TFIID has remained elusive. Our results shed new light on potential targets of the hTAF4-TAFH domain and its possible functional roles.
Recent studies have demonstrated that TAF4 acts as a keystone subunit to maintain the stability of the TFIID complex (9). Although our ScanProsite analysis revealed that several TAF subunits, such as TAF5, TAF6, and TAF13, and even TAF4 itself, contain sequences matching the DΨΨζζΨΦ recognition motif, it seems unlikely that the TAF4-TAFH domain is required to assemble the TFIID complex. Transient expression of TAF4 derivatives in mammalian cells demonstrated that the conserved region II (CII), but not the TAFH domain, was required for the TFIID complex assembly (18). In the studies by Wright et al. (9), the CII region was found to be sufficient for forming a stable core subcomplex and nucleation of a holo-TFIID complex. Furthermore, in the homologous Saccharomyces cerevisiae TFIID complex, TAF4 does not contain a TAFH domain at all, arguing against an integral role in complex formation.
A more likely function for the hTAF4-TAFH domain might involve a role as a classical coactivator or corepressor, analogous to the observed corepressor function of ETO-TAFH and consistent with the initial functional associations of the TAF4 subunit. Our motif search identified a number of transcriptional activators that would be consistent with a role in metazoan development. For example, homeodomain proteins belonging to the PBX family that play roles in limb development and hematopoiesis (PBX1, PBX2, and PBX3) are found to have an hTAF4-TAFH binding motif within the N-terminal region of the PBX domain. The HCF-1-dependent ZF and LZIP activators were also identified as containing the hTAF4-TAFH binding motif. HCF-1 is an abundant chromatin-associated protein that plays important roles in herpes simplex virus transcription and cell proliferation and has been demonstrated to associate with activators including LZIP, ZF, E2F4, VP16, and Sp1 and chromatin modifiers such as Set1/Ash2 HMT and Sin3 HDAC (19). In the case of LZIP, recent studies have suggested that in addition to HCF-1, other cellular factors are required for transcriptional activation, given that disruption of the LZIP transcriptional activation domain by mutating the DLLxxLL motif to DLLxxAA significantly reduced transactivation, but the interaction between LZIP and the HCF-1 β-propeller was not affected (20). Our binding data revealed the ability of the hTAF4-TAFH domain to interact with both the LZIP and ZF proteins via DLLxxLL motifs, suggesting that a coactivator complex including hTAF4 and HCF-1 may be formed during transcriptional activation by the ZF and LZIP activators. These data suggest that TAF4 may be a key cellular factor involved in coactivating HCF-1-dependent activators such as LZIP, ZF, or Sp1.
Also among the list of potential hTAF4-TAFH interaction partners are a number of enzymes involved in posttranslational modifications or remodeling of chromatin substrates including the mediator subunit CRSP 130, several histone deacetylases, demethylases, and a number of kinases involved in cell cycle control.
Although TAFH domains from hTAF4 and ETO share significant sequence similarity, the previously undescribed binding motif DΨΨζζΨΦ identified for the hTAF4-TAFH domain is distinct from the LxxLL motif of the ETO-TAFH domain. One of the most striking findings revealed in our study is how the hTAF4-TAFH and ETO-TAFH domains recognize their specific motifs by using binding grooves composed of highly conserved residues. Our detailed structural comparison indicates that the different target specificities result from the unique packing of helix 5, which is present only in the hTAF4-TAFH domain. The presence of the fifth helix and the subsequent rearrangement of the hTAF4-TAFH hydrophobic core residues lead to dramatic shifts in the positions of the substrate recognition residues F599 and F656. As a result, a very flat and wide binding surface is presented in the hTAF4-TAFH domain, allowing it to recognize a more expanded hydrophobic motif than that of ETO-TAFH.
Remarkably, in our alignment of known TAFH domains, we found that the proline (P669) that allows the formation of helix 5 is present only in vertebrate TAF4 and TAF4b subunits. In invertebrates such as sea urchin, Drosophila, and Caenorhabditis elegans, this proline residue is missing and there are several other changes to the hydrophobic core residues. We expect that the invertebrate TAF4-TAFH domains will be more similar to the ETO-TAFH domain. This might reflect changes in target specificity within the TAF4-TAFH family, perhaps allowing fine tuning of transcriptional response to activators such as homeodomain proteins that are important for metazoan patterning. It appears that the TAFH domain might be a portion of the TFIID complex that has diverged in response to differing developmental programs. Interestingly, the entire TAF4 subunit itself has changed significantly as it has evolved. The functional homologue of TAF4 in yeast is a single gene in which the TAFH domain is absent, and in invertebrates, TAF4 is a single gene with a TAFH domain similar to the topological organization observed in the ETO-TAFH domain. However, in mammals, the TAF4 gene has been duplicated, resulting in the presence of TAF4b, an ovarian cell-specific subunit of TFIID. These changes within TAF4 are consistent with an important role for this subunit of TFIID as a transcriptional coactivator.
In summary, the specific and unique binding motif of the hTAF4-TAFH domain identified in our study provides valuable insights into the function of the TAFH domain in the regulation of gene expression. Further studies using genetic and other techniques will be required to reveal additional biological functions for the hTAF4-TAFH domain.
Materials and Methods
Cloning, Expression, and Purification.
The human TAF4-TAFH domain (residues 575–688) was amplified with PCR and inserted into an N-terminal GST vector (pDEST15; Invitrogen, Carlsbad, CA). The plasmids of mutants (F599R, L603E, V620R, and F656R) were generated with the QuikChange site-directed mutagenesis kit (Stratagene, La Jolla, CA). The GST-tagged hTAF4-TAFH proteins were overexpressed in Escherichia coli and purified over glutathione Sepharose and cation-exchange columns. Selenomethionine-substituted hTAF4-TAFH was produced in cells grown in M9 minimal medium supplemented with glucose and thiamine and purified as for the native protein, with the addition of 5 mM DTT to all purification buffers.
The ZF gene was amplified from a HeLa cDNA library by PCR and inserted into an N-terminal maltose-binding protein (MBP) vector. The polyAla ZF mutant, in which residues 79–85 were mutated from DLLQRLL to AAAAAAA, was generated with a two-step PCR protocol. Mutant and wild-type MBP-tagged ZF proteins were overexpressed in E. coli and purified with amylose resin. Both wild-type and mutant ZF were soluble as MBP fusions; however, treatment of the wild-type ZF with tobacco etch virus (TEV) protease in 500 mM NaCl buffer resulted in precipitation of the free ZF protein. The pellet containing ZF could be resolubilized at low ionic strength. The wild-type ZF pellet was dissolved with water and dialyzed against 20 mM Mes (pH 6.3)/1 mM EDTA/1 mM DTT. Correct secondary structure of the ZF protein obtained from the pellet was verified by circular dichroism spectroscopy and shown to be α-helical. Removal of the DLLxxLL motif in the polyAla ZF mutant resulted in a protein that was soluble when cleaved from MBP. To purify the mutant, the MBP tag was cleaved with TEV protease on the amylose resin, and traces of free MBP were removed by reloading the mixture of MBP and the polyAla ZF mutant onto amylose resin.
Methylation and Crystallization.
Selenomethionine-substituted hTAF4-TAFH was subjected to reductive alkylation of lysines as described by Rayment (21). Crystals of methylated hTAF4-TAFH containing selenomethionine were grown by hanging-drop vapor diffusion at 22°C by mixing equal volumes of protein solution and crystallization buffer (200 mM ammonium acetate/2.2 M ammonium sulfate/150 mM sodium bromide). Crystals were flash-frozen in liquid nitrogen by using mother liquor plus 20% (wt/vol) D-sucrose as a cryoprotectant.
Data Collection and Structure Determination.
Multiwavelength anomalous dispersion data were collected from a single crystal at 100 K by using beamline 8.3.1 of the Advanced Light Source (Lawrence Berkeley National Laboratory, Berkeley, CA). Data were analyzed with the CCP4 set of crystallographic programs (22) by using scripts generated by ELVES (23). One selenium site was identified. Phases were generated with MLPHARE and improved through solvent flattening and histogram matching with DM (24). The model was built with O (25) and refined with REFMAC (26). The Protein Data Bank ID code is 2P6V. Structural figures were generated with PyMOL (www.pymol.org).
Peptide Synthesis.
Peptides were made by standard solid-phase peptide synthesis, purified by reverse-phase HPLC (W. M. Keck Biotechnology Resource Center, New Haven, CT; M. D. Anderson Cancer Center Core Facility), and analyzed by using liquid chromatography MS. All peptides contained free N termini and amidated C termini.
Phage Display.
hTAF4-TAFH or GST was cross-linked to MagnaBind carboxyl derivatized beads (Pierce, Rockford, IL). Nonspecific binding phage were removed by mixing 1.5 × 10¹¹ pfu of a phage display library (Ph.D.-12; New England Biolabs, Ipswich, MA) with 50 μl of GST cross-linked to beads in Tris-buffered saline-Tween buffer [TBS + 0.1% (vol/vol) Tween-20]. The supernatant was added to 50 μl of the hTAF4-TAFH cross-linked beads. After a 1-h incubation at room temperature, the beads were washed 10 times with Tris-buffered saline-Tween buffer. Bound phage were eluted with 200 μl of 1 mM synthesized E2A-AD1 peptide, titered, and amplified. During the fifth round of panning, the concentration of Tween-20 was increased to 0.5%, and bound phage were eluted with 200 μl of 200 mM glycine·HCl (pH 2.2)/1 mg/ml BSA for 10 min and neutralized with 150 μl of 1 M Tris·HCl. Forty-four plaques were picked and single-stranded DNA was purified for sequencing.
ITC.
Calorimetric titrations were carried out at 37.0 ± 0.1°C with an ITC instrument from MicroCal (Northampton, MA). hTAF4-TAFH, or its mutants, was dialyzed extensively against 50 mM Tris·HCl (pH 7.9) or Mes (pH 5.8), 50 mM NaCl, 1 mM EDTA, and 1 mM TCEP buffer. Synthesized peptides were dissolved in the same dialysis buffer. Data obtained from peptide injections into 1.4 ml of a buffer blank were subtracted from the experimental data before analysis with the MicroCal Origin 5.0 software.
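The blank subtraction described above, and the conversion of a fitted dissociation constant into a binding free energy at the experimental temperature, can be sketched as follows. This is not the fitting routine of the MicroCal Origin software; it only illustrates the correction step and the relation ΔG = RT ln(Kd) at 37°C. The function names and all numerical inputs in the example are hypothetical.

```python
import math

R_KCAL = 1.987204e-3       # gas constant, kcal/(mol*K)
T_KELVIN = 37.0 + 273.15   # titration temperature from the text, in kelvin


def subtract_blank(sample_heats, blank_heats):
    """Subtract peptide-into-buffer heats from peptide-into-protein heats, injection by injection."""
    if len(sample_heats) != len(blank_heats):
        raise ValueError("sample and blank titrations must have the same number of injections")
    return [s - b for s, b in zip(sample_heats, blank_heats)]


def delta_g_from_kd(kd_molar, temperature=T_KELVIN):
    """Binding free energy (kcal/mol) from a dissociation constant (M), 1 M standard state."""
    return R_KCAL * temperature * math.log(kd_molar)


# Hypothetical injection heats (arbitrary units), for illustration only.
corrected = subtract_blank([-10.0, -8.0, -5.0], [-1.0, -1.0, -1.0])
print(corrected)

# Hypothetical Kd of 1 uM, for illustration only.
print(f"{delta_g_from_kd(1e-6):.2f} kcal/mol")
```

Because the peptides were dissolved in the same dialysis buffer as the protein, the blank titration mainly captures the heat of peptide dilution, which is why a simple injection-by-injection subtraction is a reasonable first correction.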
GST Pull-Down Assay.
GST pull-down assays were performed by incubating equal amounts of GST-hTAF4-TAFH or GST alone bound to glutathione-Sepharose beads with wild-type or mutant ZF. The mixture was incubated with rotation for 3 h at 4°C and washed four times with buffer (20 mM Mes, pH 6.3/0.1% Nonidet P-40/1 mM EDTA/1 mM DTT). Bound proteins were eluted with SDS/PAGE loading buffer and resolved by SDS/12% PAGE.
Peptide Docking Calculation.
An α-helical conformation of the ZF peptide (FSDLLQRLLN) was generated with Biopolymer and docked to hTAF4-TAFH by using the grid docking method as implemented in the Affinity module of Insight II (Accelrys, San Diego, CA). Residues forming the binding groove of the hTAF4-TAFH domain, including K595, N598, F599, T602, L603, L606, A607, Q612, T616, V620, L641, Y642, L645, S647, Q650, Y652, L653, and F656, were used for docking. The consistent valence force field was used with standard parameters. A full-energy minimization was carried out with three conjugate gradient minimization steps. The three docked models with the lowest energy were retained.
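For readers who want to inspect the groove residues listed above in the deposited structure, the sketch below turns the residue labels into a residue-number selection string in the `resi N+N+...` form that PyMOL accepts. It is a convenience for visualization only and is unrelated to the Insight II docking input; the function names are hypothetical.

```python
import re

# Groove residues exactly as listed in the text (one-letter code + residue number).
GROOVE_RESIDUES = (
    "K595 N598 F599 T602 L603 L606 A607 Q612 T616 "
    "V620 L641 Y642 L645 S647 Q650 Y652 L653 F656"
).split()


def parse_residue(label):
    """Split a label such as 'K595' into ('K', 595)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    return match.group(1), int(match.group(2))


def pymol_selection(labels):
    """Build a 'resi 595+598+...' selection string from residue labels."""
    numbers = [str(parse_residue(label)[1]) for label in labels]
    return "resi " + "+".join(numbers)


print(len(GROOVE_RESIDUES), "groove residues")
print(pymol_selection(GROOVE_RESIDUES))
```

The printed string can be pasted after a `select` command in PyMOL once the coordinates (PDB ID code 2P6V) are loaded; checking the parsed one-letter codes against the structure is a quick way to catch numbering mismatches.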
Acknowledgments
We thank Drs. Richard G. Brennan, Angus Wilson, and Suparna Bhattacharya for their thoughtful review of and comments on the manuscript and Drs. James Holton and George Meig of Beamline 8.3.1 at the Advanced Light Source (Berkeley, CA). This work was supported by National Institutes of Health Grant GM069769 and the Burroughs Wellcome Fund.
Abbreviations
- TAF: TBP-associated factors
- ITC: isothermal titration calorimetry
- MBP: maltose-binding protein
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID code 2P6V).
This article contains supporting information online at www.pnas.org/cgi/content/full/0608570104/DC1.
References
- 1. Tora L. Genes Dev. 2002;16:673–675. doi: 10.1101/gad.976402.
- 2. Jacobson RH, Ladurner AG, King DS, Tjian R. Science. 2000;288:1422–1425. doi: 10.1126/science.288.5470.1422.
- 3. Mizzen CA, Yang XJ, Kokubo T, Brownell JE, Bannister AJ, Owen-Hughes T, Workman J, Wang L, Berger SL, Kouzarides T, et al. Cell. 1996;87:1261–1270. doi: 10.1016/s0092-8674(00)81821-8.
- 4. Zhou J, Zwicker J, Szymanski P, Levine M, Tjian R. Proc Natl Acad Sci USA. 1998;95:13483–13488. doi: 10.1073/pnas.95.23.13483.
- 5. Shen WC, Bhaumik SR, Causton HC, Simon I, Zhu X, Jennings EG, Wang TH, Young RA, Green MR. EMBO J. 2003;22:3395–3402. doi: 10.1093/emboj/cdg336.
- 6. Tanese N, Saluja D, Vassallo MF, Chen JL, Admon A. Proc Natl Acad Sci USA. 1996;93:13611–13616. doi: 10.1073/pnas.93.24.13611.
- 7. Saluja D, Vassallo MF, Tanese N. Mol Cell Biol. 1998;18:5734–5743. doi: 10.1128/mcb.18.10.5734.
- 8. Mengus G, May M, Carre L, Chambon P, Davidson I. Genes Dev. 1997;11:1381–1395.
- 9. Wright KJ, Marr MT, II, Tjian R. Proc Natl Acad Sci USA. 2006;103:12347–12352. doi: 10.1073/pnas.0605499103.
- 10. Dunah AW, Jeong H, Griffin A, Kim YM, Standaert DG, Hersch SM, Mouradian MM, Young AB, Tanese N, Krainc D. Science. 2002;296:2238–2243. doi: 10.1126/science.1072613.
- 11. Gangloff YG, Werten S, Romier C, Carre L, Poch O, Moras D, Davidson I. Mol Cell Biol. 2000;20:340–351. doi: 10.1128/mcb.20.1.340-351.2000.
- 12. Davis JN, McGhee L, Meyers S. Gene. 2003;303:1–10. doi: 10.1016/s0378-1119(02)01172-1.
- 13. Zhang J, Kalkum M, Yamamura S, Chait BT, Roeder RG. Science. 2004;305:1286–1289. doi: 10.1126/science.1097937.
- 14. Plevin MJ, Zhang J, Guo C, Roeder RG, Ikura M. Proc Natl Acad Sci USA. 2006;103:10242–10247. doi: 10.1073/pnas.0603463103.
- 15. Gattiker A, Gasteiger E, Bairoch A. Appl Bioinformatics. 2002;1:107–108.
- 16. Aasland R, Abrams C, Ampe C, Ball LJ, Bedford MT, Cesareni G, Gimona M, Hurley JH, Jarchau T, Lehto VP, et al. FEBS Lett. 2002;513:141–144. doi: 10.1016/s0014-5793(01)03295-1.
- 17. Luty BA, Wasserman ZR, Stouten PFW, Hodge CN, Zacharias M, McCammon JA. J Comput Chem. 1995;16:454–464.
- 18. Furukawa T, Tanese N. J Biol Chem. 2000;275:29847–29856. doi: 10.1074/jbc.M002989200.
- 19. Wysocka J, Herr W. Trends Biochem Sci. 2003;28:294–304. doi: 10.1016/S0968-0004(03)00088-4.
- 20. Luciano RL, Wilson AC. Proc Natl Acad Sci USA. 2000;97:10757–10762. doi: 10.1073/pnas.190062797.
- 21. Rayment I. Methods Enzymol. 1997;276:171–179.
- 22. Collaborative Computational Project, no 4. Acta Crystallogr D. 1994;50:760–763.
- 23. Holton J, Alber T. Proc Natl Acad Sci USA. 2004;101:1537–1542. doi: 10.1073/pnas.0306241101.
- 24. Cowtan KD, Main P. Acta Crystallogr D. 1996;52:43–48. doi: 10.1107/S090744499500761X.
- 25. Jones TA, Zou JY, Cowan SW, Kjeldgaard M. Acta Crystallogr A. 1991;47:110–119. doi: 10.1107/s0108767390010224.
- 26. Murshudov GN, Vagin AA, Dodson EJ. Acta Crystallogr D. 1997;53:240–255. doi: 10.1107/S0907444996012255.