Abstract
The past 4 decades have seen remarkable advances in our understanding of the structural basis of gene regulation. Technological advances in protein expression, nucleic acid synthesis, and structural biology made it possible to study the proteins that regulate transcription in the context of ever larger complexes containing proteins bound to DNA. This review, written on the occasion of the 50th anniversary of the founding of the Protein Data Bank, focuses on the insights gained from structural studies of protein–DNA complexes and the role the PDB has played in driving this research. I cover highlights in the field, beginning with X-ray crystal structures of the first DNA-binding domains to be studied and continuing through recent cryo-EM structures of transcription factors bound to nucleosomal DNA.
Keywords: DNA-binding proteins, protein structure, nucleic acid structure, transcription factor, DNA–protein interaction, gene regulation
Abbreviations: CAP, catabolite activator protein; PDB, Protein Data Bank; TBP, TATA-binding protein
The publication in the early 1980s of the first crystal structures of DNA and of proteins that bind to specific DNA sequences (1, 2, 3) marked a turning point in structural biology. X-ray crystallography had already made a profound impact on biology and biochemistry (4), beginning with the first atomic models of hemoglobin (5) and myoglobin (6) and continuing with the first structures of enzymes (7), antibodies (8), and tRNA (9, 10). At the same time, the need for large amounts of material to grow crystals of sufficient size and quantity restricted the field to naturally abundant proteins. With the exception of tRNA, it was not possible to obtain the homogeneous samples of RNA or DNA needed for crystallization trials. The advent of molecular cloning and strategies for overexpressing proteins in bacteria, however, dramatically increased the number and types of proteins whose structures could be determined. The structures of E. coli catabolite activator protein (CAP) (2) and of the bacteriophage lambda Cro (1) and cI (3) repressor proteins were electrifying and provided the first glimpses of how proteins might bind DNA and regulate transcription. The much broader set of biological problems to which structural methods could now be applied greatly increased the interest in structural biology. In parallel, the development of chemical methods to synthesize DNA oligonucleotides of defined length and sequence made it possible to crystallize and determine structures of DNA (11, 12), as well as of protein–DNA complexes. Indeed, it was the 1980 publication of the crystal structure of a B-DNA dodecamer by Dickerson and colleagues (1BNA) (12) that finally provided experimental proof for the B-DNA model proposed by Watson and Crick in 1953 (13). These combined developments in recombinant DNA technology and chemical synthesis of DNA marked the beginning of a new era in studies of protein–DNA interactions and gene regulation.
The advances in cloning and oligonucleotide synthesis played an additional role in expanding the impact of structural biology beyond simply making it possible to determine structures of protein–DNA complexes. The development of approaches that utilized oligonucleotides to engineer specific amino acid substitutions into proteins (14) meant that one could use structural information to introduce mutations that could then be used to test mechanistic hypotheses based on crystal structures. An early example was the test of a model for how the helix–turn–helix element (15), which had been identified in early structures of DNA-binding proteins, mediated contacts with DNA base pairs. Site-directed mutagenesis of the bacteriophage 434 repressor validated the proposed model for DNA binding and provided clues as to how side chain contacts determined DNA sequence recognition (16). These new approaches, which made it possible to use structural information to drive biochemical and genetic studies, further broadened interest in structural biology and helped fuel a dramatic expansion in what had once been a relatively small community of X-ray crystallographers and NMR spectroscopists.
The ability to utilize the new structural information on DNA–protein complexes was, however, limited because many of these new structures were not broadly available. Although the Protein Data Bank (PDB) had been established more than a decade earlier, coordinate deposition was voluntary and many structures of proteins and oligonucleotides were not publicly available (17). Indeed, coordinates for the first DNA-binding proteins mentioned above, CAP (2), lambda Cro (1), and lambda cI (3), were not deposited in the PDB. Recommendations from the International Union of Crystallography (18) and policy changes at the National Institutes of Health (19) and other funding entities led to mandatory coordinate deposition, making these exciting structures available to all investigators. The number and complexity of protein–nucleic acid complex structures have increased by many orders of magnitude since that time, fueled by technical advances in X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy and, most recently, cryo-electron microscopy (cryo-EM). The availability in the PDB of so many structures of individual transcription factors, enzymes, and nucleosomes has greatly facilitated structure determination of large complexes that contain many of these macromolecules. Most importantly, these structures are easily accessible to everyone, including those outside the structural biology community, and continue to drive new science.
This review focuses on the insights into the regulation of transcription gained from structural studies of protein–DNA complexes and the role the PDB has played in driving this research. I present a historical view of some of the milestones, beginning with structural studies of bacterial and phage repressor proteins bound to DNA and continuing through structures of larger complexes determined by cryo-EM. I have provided the PDB ID in either the text or figure legend for each structure mentioned. Alas, a number of early structures were never deposited in the PDB, so in these cases I also provide a reference to a subsequent structure, along with its corresponding PDB ID. Given its emphasis on regulation, this review concentrates on sequence-specific DNA-binding proteins and does not cover the structural studies of RNA polymerase or of the many transcription factors and chromatin-modifying enzymes required for transcription initiation and elongation. The reader is referred to several recent reviews that cover the remarkable structures of the eukaryotic (20, 21) and bacterial (22) transcription machinery.
Recognition of specific DNA sequences
Regulation of specific genes depends on proteins that can recognize a particular sequence of DNA base pairs in a regulatory region. In bacteria, these proteins either activate or repress transcription by directly interacting with RNA polymerase (23). In eukaryotes, transcriptional regulators have separate domains that may recruit coactivator or corepressor complexes that add posttranslational modifications to histones or remove them, reposition nucleosomes, or promote assembly of the transcription preinitiation complex (24). Just a few years after structures of the first isolated DNA-binding domains mentioned above were elucidated (1, 2, 3), the first protein–DNA complexes reported in the mid-1980s marked the beginning of our understanding of the molecular basis for recognition of specific DNA sequences. Structures of complexes with the bacteriophage lambda (1LMB) (25, 26) and 434 repressors (2OR1) (27, 28) and Cro (3CRO, 4CRO) (29, 30, 31) proteins showed how the second helix in the previously identified helix–turn–helix motif (15) inserted into the major groove of B-DNA (Fig. 1A). Side chains in the recognition helix contacted the edges of the DNA bases directly or via water-mediated hydrogen bonds, thereby contributing to sequence specificity, while other regions of the protein formed additional stabilizing contacts with the sugar–phosphate backbone. Although the bacteriophage repressors bound to relatively straight DNA, it turned out that the E. coli CAP protein (1CGP) induces a dramatic 90° bend in the helix axis (32) (Fig. 1B). This would be the first of many examples of proteins that induce bends and other distortions in the DNA that modulate the nature of sequence-specific contacts and, in most cases, increase the buried surface area between protein and DNA. The helix–turn–helix motif was soon found in eukaryotic homeodomain proteins such as Drosophila engrailed (1HDD) (33) (Fig. 1, C and D) and yeast MATα2 (1APL) (34), although the longer recognition helix in homeodomains docked on DNA in a somewhat different manner. In general, all of these structures provided different examples of proteins that form chemically complementary interfaces with the DNA.
An unexpected twist on the nature of DNA sequence recognition emerged with the structure of the bacterial Trp repressor bound to DNA (35). Although Trp repressor also contains a helix–turn–helix, the protein forms no direct contacts with DNA bases. Instead, there are water-mediated contacts between Trp repressor and the base pairs in the major groove (Fig. 1E), with direct contacts formed only with the DNA backbone. The DNA sequence specificity of Trp repressor derives from sequence-dependent variations in DNA structure, a form of recognition termed indirect readout (35). Subsequent analyses have shown that sequence-dependent local variations in DNA structure play a role in a broad array of proteins that bind DNA (36).
As more structures of complexes were determined, the remarkable structural diversity of sequence-specific DNA-binding domains and the different modes of interaction with both the major and minor grooves quickly became evident. The early 1990s saw a veritable explosion in the number of novel DNA-binding domains. The structure of the DNA-bound bacterial Met repressor (1CMA) (37) revealed that a pair of β strands fits in the major groove (Fig. 1F) just as well as an α helix. This validated a prediction, made well before any structures of DNA-binding proteins had been determined, that both α helices and β sheets had the optimal dimensions to fit in the major groove of B-DNA (38). Structures of eukaryotic transcriptional regulators such as the basic region-leucine zipper (bZIP) (39) (Fig. 2A), helix–loop–helix (40) (Fig. 2B), Gal4-type zinc binding domain (41) (Fig. 2C), and the immunoglobulin-like Rel homology domain (42, 43) (Fig. 2D) proteins represented yet other structurally distinct modes of docking on DNA and recognizing specific DNA sequences.
Perhaps the most unexpected finding from this era was the discovery of the dramatic DNA distortion induced by the eukaryotic TATA-binding protein (TBP), a subunit of the basal transcription factor complex, TFIID, that binds to the TATA box promoter element and helps nucleate assembly of the transcription preinitiation complex (44). In marked contrast to the proteins that insert helices, strands, or loops into the DNA grooves, essentially forming a structurally complementary surface (see Figs. 1 and 2), it is the concave surface of TBP that contacts DNA in the minor groove (1YTB, 1VTL) (45, 46) (Fig. 2E). A severe distortion in the DNA, which contains a nearly 90° bend in the helix axis and is underwound, enables the concave surface of TBP to form sequence-specific contacts with bases in the minor groove.
Zinc finger proteins are distinct from other classes of DNA-binding domains in their modular recognition of DNA sequences, the molecular details of which were first revealed in the structure of the three zinc fingers of Zif268 bound to DNA (1ZAA) (47) (Fig. 3A). Members of this large family of transcriptional regulators contain multiple tandem repeats of the ~30 amino acid domain with a structural zinc coordinated by two histidine and two cysteine side chains (48), with each zinc finger recognizing 3 to 4 base pairs (47) (Fig. 3A). The modular nature of zinc finger proteins presented an opportunity to engineer proteins with particular DNA-binding specificities (49, 50, 51, 52), which could then be used to target nucleases or other domains to specific sites in the genome (53, 54). This marked the first attempt at targeted genome engineering, which was followed a decade later by designed TAL effector proteins (55). Each repeat in these plant DNA-binding proteins recognizes a single base pair (56) (Fig. 3B), which greatly facilitated design of proteins with the desired DNA sequence specificity (57, 58) that could similarly be linked to endonuclease domains for genome engineering (55).
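The modular logic described above can be made concrete with a small sketch. The Python snippet below is only an illustration of the bookkeeping involved, not a design method: it assumes non-overlapping subsites of exactly 3 base pairs per zinc finger (the text gives 3 to 4, and real fingers can make overlapping contacts) and 1 base pair per TAL effector repeat. The function names and the example target sequence are hypothetical placeholders.

```python
import math

def modules_needed(site_length_bp: int, bp_per_module: int) -> int:
    """Number of tandem modules needed to cover a target site."""
    return math.ceil(site_length_bp / bp_per_module)

def split_into_subsites(site: str, bp_per_module: int) -> list[str]:
    """Split a target site into consecutive, non-overlapping subsites."""
    return [site[i:i + bp_per_module] for i in range(0, len(site), bp_per_module)]

# Hypothetical 9-bp target used purely for illustration.
target = "GACTTCGGA"

# Simplifying assumption: each zinc finger reads exactly 3 bp.
zinc_finger_subsites = split_into_subsites(target, 3)
# Each TAL effector repeat reads a single base pair.
tal_subsites = split_into_subsites(target, 1)

print(modules_needed(len(target), 3), zinc_finger_subsites)
print(modules_needed(len(target), 1), tal_subsites)
```

Under these assumptions, a 9-base pair site calls for three zinc fingers or nine TAL effector repeats. The one-repeat-per-base-pair mapping is what makes the TAL effector case simpler to design, since each position can be chosen independently.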
An analysis of PDB depositions as of the year 2000 (59) identified eight broad classes of sequence-specific DNA-binding proteins, with variations within each class. One of the common themes to emerge from all of these studies was the prevalence of DNA sequence recognition via contacts in the major groove, where the pattern of nucleobase functional groups is unique to each DNA sequence. Although it had initially been thought by some that there might be a recognition code in the form of a one-to-one correspondence between a particular base and one or more unique side chains, it became apparent early on that there was no such code (60), with the exception of the TAL effector proteins (57, 58). A comprehensive review of the determinants of DNA sequence recognition can be found in (61).
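The statement that the major groove presents a unique pattern of functional groups can be illustrated with a short sketch. The patterns below are the standard textbook summary of the hydrogen-bond acceptors (A), donors (D), nonpolar hydrogens (H), and methyl groups (M) exposed along the edge of each Watson–Crick base pair; they are not taken from this review, and the dictionary and function names are hypothetical.

```python
# Functional groups exposed on the base pair edges, read across each groove.
# A = hydrogen-bond acceptor, D = hydrogen-bond donor,
# H = nonpolar hydrogen, M = methyl group (thymine C5 methyl).
MAJOR_GROOVE = {"A:T": "ADAM", "T:A": "MADA", "G:C": "AADH", "C:G": "HDAA"}
MINOR_GROOVE = {"A:T": "AHA", "T:A": "AHA", "G:C": "ADA", "C:G": "ADA"}

def distinguishable(patterns: dict[str, str]) -> bool:
    """Return True if every base pair presents a distinct pattern."""
    return len(set(patterns.values())) == len(patterns)

def group_by_pattern(patterns: dict[str, str]) -> dict[str, list[str]]:
    """Group base pairs that present the same pattern of functional groups."""
    groups: dict[str, list[str]] = {}
    for base_pair, pattern in patterns.items():
        groups.setdefault(pattern, []).append(base_pair)
    return groups

print("major groove unique:", distinguishable(MAJOR_GROOVE))
print("minor groove unique:", distinguishable(MINOR_GROOVE))
print("minor groove groups:", group_by_pattern(MINOR_GROOVE))
```

In this simplified picture, all four base pairs can be told apart in the major groove, whereas in the minor groove A:T cannot be distinguished from T:A, nor G:C from C:G. Proteins such as TBP that read the minor groove therefore also rely on sequence-dependent DNA shape and deformability.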
Combinatorial regulation of transcription
Many eukaryotic genes are regulated by multimeric complexes that can regulate transcription in response to multiple inputs. Beginning in the mid-1990s, structural studies of transcriptional regulators advanced to the next level of complexity, with structures determined of multiprotein complexes bound to DNA. One of the first was of the nuclear hormone receptor heterodimer composed of 9-cis-retinoic acid receptor (RXR) and thyroid hormone receptor (TR) (2NLL) (62) (Fig. 4A). Members of this family of DNA-binding proteins can form homodimers or heterodimers and contain separate ligand-binding domains, which change conformation upon ligand binding and recruit enzyme complexes that activate (coactivators) or repress (corepressors) transcription (63). Of interest, the RXR and TR DNA-binding domains bind DNA in tandem (62), in contrast with other members of this family, such as glucocorticoid receptor, which bind as symmetric dimers (64). Structural studies of the homeodomain superfamily revealed an even greater degree of complexity, as selected members of this family can heterodimerize with other homeodomain proteins or with DNA-binding proteins belonging to completely dissimilar structural families. The yeast MATα2 homeodomain protein, for example, can heterodimerize with a second homeodomain protein, MATa1 (1YRN) (65), or with MCM1 (1MNM) (66), a MADS box DNA-binding protein that is unrelated in structure to homeodomains (Fig. 4, B and C). Structures of Drosophila Ubx/Exd (1B8I) (67) and human HoxB1/Pbx1 (1B72) (68) homeodomain heterodimers bound to DNA provided additional insights into how transcription programs are regulated during development.
Through the late 1990s and early 2000s, structures of even larger complexes were determined. With the availability of many structures of smaller protein–DNA complexes in the PDB, it became possible to model regulatory regions and enhancers to which multiple proteins bind. The β-interferon enhancer, for example, contains binding sites within the 55-base pair enhancer sequence for eight proteins that bind cooperatively, together forming an “enhanceosome” (69). By combining structural information from several multiprotein subcomplexes (2O61, 2O6G, 1T2K), it was possible to assemble a model of the entire enhanceosome containing the bZIP proteins, c-Jun and ATF-2; the Rel homology domain proteins, p50 and RelA; and four IRF proteins, IRF-3A, 3C, 7B, and 7D (Fig. 4D) (70, 71).
Transcription factor binding and chromatin
The packaging of eukaryotic DNA into chromatin impacts all cellular processes requiring access to DNA, including transcription. It is perhaps not surprising, then, that the structure of the fundamental organizational unit of the genome, the nucleosome (1AOI), is the most highly cited structure in the PDB (72). The first high-resolution structure of the nucleosome core particle, determined in 1997 (73), revealed the molecular details of how the 146-base pair DNA duplex wraps nearly twice around the histone octamer core, which contains two copies each of histones H2A, H2B, H3, and H4 (Fig. 5A). One of the important observations to emerge from this and subsequent studies was that the DNA is not smoothly bent but instead contains local kinks that are favored by particular DNA sequences (74). These sequence-dependent differences in the relative energetic penalty for DNA kinking and bending thus play a key role in positioning nucleosomes at particular locations in the genome (74).
Most transcriptional regulators bind to nucleosome-depleted or nucleosome-free regions, where their DNA-binding domains can freely access the DNA. A class of regulatory proteins known as pioneer transcription factors, however, bind directly to nucleosomal DNA in regions of compacted chromatin and reprogram cell fate by altering local chromatin structure (75). Two recent cryo-EM studies have provided the first insights into how the pioneer transcription factors OCT4 and SOX2 bind to DNA that is simultaneously wrapped around the nucleosome (76, 77). The position of the recognition sequence within the nucleosomal DNA, or even whether there is a fixed site, is not clear, so the two studies inserted the recognition sequence in different locations in the DNA based on solution experiments. The structure of both OCT4 and SOX2 bound to DNA near the exit site from the nucleosome (superhelical location −6, or SHL −6) shows the DNA peeled away from the histone core (Fig. 5B), suggesting a mechanism by which these pioneer factors could help open chromatin (6T90) (77). In structures of SOX2 (6T7B) and of a homolog, SOX11 (6T7A), bound to an internal DNA site at SHL +2 (Fig. 5C), the DNA is somewhat distorted and bulges away from the histone core (76). Together, these structures constitute an important start in understanding the mechanism by which these and other pioneer transcription factors open chromatin and alter transcription programs.
Visualizing entire transcription initiation complexes
A more complete understanding of how DNA-binding proteins orchestrate transcription will require structures of ever-larger complexes containing all necessary components for transcription initiation. Thanks to the recent “resolution revolution” in cryo-EM (78), one structure after another of huge protein–nucleic acid complexes has provided unprecedented insights into transcription. Since virtually all of these complexes contain proteins whose structures had been determined, usually by X-ray crystallography, the ready availability of coordinates and associated data in the PDB has greatly facilitated map interpretation and model building. The many structures of basal transcription factors bound to DNA, such as TBP (1YTB, 1VTL) (45, 46), TFIIA (79), and TFIIB (80), and of RNA polymerase II have provided the foundation on which to interpret structures of transcription initiation and elongation complexes (for a comprehensive recent review, see (21)). The change in the PDB coordinate data format to mmCIF/PDBx (81) was another advance that facilitated working with such large structures. The original PDB format, which was based on the number of characters that could fit on an IBM punch card, could only accommodate structures with up to 62 chains and fewer than 100,000 atoms. The mmCIF/PDBx format has no such limit and also accommodates additional metadata containing information about the macromolecules as well as experimental details. Thus, structures of a 2.7-MDa bacterial expressosome containing RNA polymerase, a ribosome, the NusG transcription factor, duplex DNA, mRNA, and tRNA could be accommodated in a single coordinate file, even though it contains 65 unique chains and more than 175,000 non-hydrogen atoms (6ZTJ, 6VU3) (82, 83). The rate at which these huge structures are emerging is so rapid that new structures of the RNA polymerase II preinitiation complex (2.6 MDa) appeared as this article was being revised (84).
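The size limits of the original PDB format follow directly from its fixed-width, 80-column layout, which can be sketched in a few lines of Python. The snippet assumes the standard legacy ATOM record layout, in which the atom serial number occupies five columns and the chain identifier a single alphanumeric character; the constant and function names are hypothetical, and the expressosome figures are the approximate values quoted in the text.

```python
import string

# Legacy PDB ATOM records are fixed-width lines of 80 columns.
# Assumed layout: columns 7-11 (5 characters) hold the atom serial number,
# and column 22 (1 character) holds the chain identifier.
SERIAL_WIDTH = 5
CHAIN_ID_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits

MAX_ATOMS = 10**SERIAL_WIDTH - 1      # largest 5-digit serial number
MAX_CHAINS = len(CHAIN_ID_ALPHABET)   # 26 + 26 + 10 one-character IDs

def fits_legacy_pdb(n_atoms: int, n_chains: int) -> bool:
    """Check whether a model fits within the fixed-column limits."""
    return n_atoms <= MAX_ATOMS and n_chains <= MAX_CHAINS

print(MAX_ATOMS, MAX_CHAINS)
# Approximate expressosome size quoted in the text: 65 chains, >175,000 atoms.
print(fits_legacy_pdb(n_atoms=175_000, n_chains=65))
```

Because mmCIF/PDBx stores data as tagged, variable-width fields rather than fixed columns, neither limit applies, which is why a model of this size can be kept in a single file.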
The future
What advances in studies of the mechanisms underlying transcription regulation can we anticipate in the near future and how will the PDB continue to support them? As structures have become more complex, it is increasingly common for multiple methods to be used to arrive at the final model. In addition to combining information from X-ray crystallography, NMR, cryo-EM and solution X-ray scattering, data from other biophysical and biochemical methods that provide complementary information are starting to be used to interpret cryo-EM maps of very large complexes with multiple components. This approach, referred to as integrative structural biology (85), incorporates information from methods such as mass spectrometry–cross-linking, hydrogen–deuterium exchange, Förster resonance energy transfer, chromosome capture, and many others. The opportunities presented by integrative structure determination have also presented challenges to the PDB as to how the data should be archived and displayed and how the models should be validated. The PDB has ongoing efforts to address these issues (86).
An important gap in our understanding of how transcription is regulated stems from our inability to capture the dynamics and time-dependent events that link structural snapshots. It is now possible to capture multiple states in a single cryo-EM sample, but linking them temporally involves extrapolation and educated guesswork. Solution and solid-state NMR, along with development of methods for time-resolved structure determination that can be applied to large assemblies, could help fill this gap. There is also the hope that it will eventually be possible to study all these events in their natural, cellular context with further development of cryo-electron tomography (87) or other imaging techniques that have yet to be invented. With so many possibilities ahead, now is as exciting a time in structural biology as when those first few structures of DNA-binding proteins were published 40 years ago.
Conflict of interest
The author is a member of the scientific advisory board of Thermo Fisher Scientific.
Acknowledgments
I thank Daniel Panne for providing the coordinates of the β-interferon enhanceosome model. My deepest thanks to all the leaders and staff of the Protein Data Bank, past and present, who have created such a remarkable resource that benefits the entire scientific and education community.
Funding and additional information
Supported by NIGMS, National Institutes of Health grant GM130393 (C. W.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Biography
Cynthia Wolberger, Professor of Biophysics and Biophysical Chemistry at the Johns Hopkins University School of Medicine, is a leader in research on transcriptional regulation and ubiquitin signaling.
Edited by Karin Musier-Forsyth
References
- 1.Anderson W.F., Ohlendorf D.H., Takeda Y., Matthews B.W. Structure of the cro repressor from bacteriophage lambda and its interaction with DNA. Nature. 1981;290:754–758. doi: 10.1038/290754a0. [DOI] [PubMed] [Google Scholar]
- 2.McKay D.B., Steitz T.A. Structure of catabolite gene activator protein at 2.9 A resolution suggests binding to left-handed B-DNA. Nature. 1981;290:744–749. doi: 10.1038/290744a0. [DOI] [PubMed] [Google Scholar]
- 3.Pabo C.O., Lewis M. The operator-binding domain of lambda repressor: Structure and DNA recognition. Nature. 1982;298:443–447. doi: 10.1038/298443a0. [DOI] [PubMed] [Google Scholar]
- 4.Jaskolski M., Dauter Z., Wlodawer A. A brief history of macromolecular crystallography, illustrated by a family tree and its nobel fruits. FEBS J. 2014;281:3985–4009. doi: 10.1111/febs.12796. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Perutz M.F., Rossmann M.G., Cullis A.F., Muirhead H., Will G., North A.C. Structure of haemoglobin: A three-dimensional Fourier synthesis at 5.5-A. Resolution, obtained by X-ray analysis. Nature. 1960;185:416–422. doi: 10.1038/185416a0. [DOI] [PubMed] [Google Scholar]
- 6.Kendrew J.C., Dickerson R.E., Strandberg B.E., Hart R.G., Davies D.R., Phillips D.C., Shore V.C. Structure of myoglobin: A three-dimensional Fourier synthesis at 2 A. resolution. Nature. 1960;185:422–427. doi: 10.1038/185422a0. [DOI] [PubMed] [Google Scholar]
- 7.Blake C.C., Koenig D.F., Mair G.A., North A.C., Phillips D.C., Sarma V.R. Structure of hen egg-white lysozyme. A three-dimensional Fourier synthesis at 2 Angstrom resolution. Nature. 1965;206:757–761. doi: 10.1038/206757a0. [DOI] [PubMed] [Google Scholar]
- 8.Poljak R.J., Amzel L.M., Avey H.P., Chen B.L., Phizackerley R.P., Saul F. Three-dimensional structure of the Fab' fragment of a human immunoglobulin at 2,8-A resolution. Proc. Natl. Acad. Sci. U. S. A. 1973;70:3305–3310. doi: 10.1073/pnas.70.12.3305. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Kim S.H., Suddath F.L., Quigley G.J., McPherson A., Sussman J.L., Wang A.H., Seeman N.C., Rich A. Three-dimensional tertiary structure of yeast phenylalanine transfer RNA. Science. 1974;185:435–440. doi: 10.1126/science.185.4149.435. [DOI] [PubMed] [Google Scholar]
- 10.Robertus J.D., Ladner J.E., Finch J.T., Rhodes D., Brown R.S., Clark B.F., Klug A. Structure of yeast phenylalanine tRNA at 3 A resolution. Nature. 1974;250:546–551. doi: 10.1038/250546a0. [DOI] [PubMed] [Google Scholar]
- 11.Drew H., Takano T., Tanaka S., Itakura K., Dickerson R.E. High-salt d(CpGpCpG), a left-handed Z' DNA double helix. Nature. 1980;286:567–573. doi: 10.1038/286567a0. [DOI] [PubMed] [Google Scholar]
- 12.Wing R., Drew H., Takano T., Broka C., Tanaka S., Itakura K., Dickerson R.E. Crystal structure analysis of a complete turn of B-DNA. Nature. 1980;287:755–758. doi: 10.1038/287755a0. [DOI] [PubMed] [Google Scholar]
- 13.Watson J.D., Crick F.H. Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature. 1953;171:737–738. doi: 10.1038/171737a0. [DOI] [PubMed] [Google Scholar]
- 14.Hutchison C.A, 3rd, Phillips S., Edgell M.H., Gillam S., Jahnke P., Smith M. Mutagenesis at a specific position in a DNA sequence. J Biol Chem. 1978;253:6551–6560. [PubMed] [Google Scholar]
- 15.Sauer R.T., Yocum R.R., Doolittle R.F., Lewis M., Pabo C.O. Homology among DNA-binding proteins suggests use of a conserved super-secondary structure. Nature. 1982;298:447–451. doi: 10.1038/298447a0. [DOI] [PubMed] [Google Scholar]
- 16.Wharton R.P., Brown E.L., Ptashne M. Substituting an alpha-helix switches the sequence-specific DNA interactions of a repressor. Cell. 1984;38:361–369. doi: 10.1016/0092-8674(84)90491-4. [DOI] [PubMed] [Google Scholar]
- 17.Barinaga M. The missing crystallography data. Science. 1989;245:1179–1181. doi: 10.1126/science.2781276. [DOI] [PubMed] [Google Scholar]
- 18.Macromolecules, I. U. o. C. C. o. B. Policy on publication and the deposition of data from crystallographic studies of biological macromolecules. Acta Cryst. 1989;A45:658. [Google Scholar]
- 19.NIH . National Institutes of Health; United States: 1999. NIH Policy Relating to Deposition of Atomic Coordinates into Structural Databases. [Google Scholar]
- 20.Greber B.J., Nogales E. The structures of eukaryotic transcription pre-initiation complexes and their functional implications. Subcell Biochem. 2019;93:143–192. doi: 10.1007/978-3-030-28151-9_5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Osman S., Cramer P. Structural biology of RNA polymerase II transcription: 20 Years on. Annu. Rev. Cell Dev. Biol. 2020;36:1–34. doi: 10.1146/annurev-cellbio-042020-021954. [DOI] [PubMed] [Google Scholar]
- 22.Murakami K.S. Structural biology of bacterial RNA polymerase. Biomolecules. 2015;5:848–864. doi: 10.3390/biom5020848. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Browning D.F., Busby S.J. The regulation of bacterial transcription initiation. Nat. Rev. Microbiol. 2004;2:57–65. doi: 10.1038/nrmicro787. [DOI] [PubMed] [Google Scholar]
- 24.Weake V.M., Workman J.L. Inducible gene expression: Diverse regulatory mechanisms. Nat. Rev. Genet. 2010;11:426–437. doi: 10.1038/nrg2781. [DOI] [PubMed] [Google Scholar]
- 25.Jordan S.R., Pabo C.O. Structure of the lambda complex at 2.5 A resolution: Details of the repressor-operator interactions. Science. 1988;242:893–899. doi: 10.1126/science.3187530. [DOI] [PubMed] [Google Scholar]
- 26.Beamer L.J., Pabo C.O. Refined 1.8 A crystal structure of the lambda repressor-operator complex. J. Mol. Biol. 1992;227:177–196. doi: 10.1016/0022-2836(92)90690-l. [DOI] [PubMed] [Google Scholar]
- 27.Anderson J.E., Ptashne M., Harrison S.C. Structure of the repressor-operator complex of bacteriophage 434. Nature. 1987;326:846–852. doi: 10.1038/326846a0. [DOI] [PubMed] [Google Scholar]
- 28.Aggarwal A.K., Rodgers D.W., Drottar M., Ptashne M., Harrison S.C. Recognition of a DNA operator by the repressor of phage 434: A view at high resolution. Science. 1988;242:899–907. doi: 10.1126/science.3187531. [DOI] [PubMed] [Google Scholar]
- 29.Wolberger C., Dong Y.C., Ptashne M., Harrison S.C. Structure of a phage 434 Cro/DNA complex. Nature. 1988;335:789–795. doi: 10.1038/335789a0. [DOI] [PubMed] [Google Scholar]
- 30.Mondragon A., Harrison S.C. The phage 434 Cro/OR1 complex at 2.5 A resolution. J. Mol. Biol. 1991;219:321–334. doi: 10.1016/0022-2836(91)90568-q. [DOI] [PubMed] [Google Scholar]
- 31.Brennan R.G., Roderick S.L., Takeda Y., Matthews B.W. Protein-DNA conformational changes in the crystal structure of a lambda Cro-operator complex. Proc. Natl. Acad. Sci. U. S. A. 1990;87:8165–8169. doi: 10.1073/pnas.87.20.8165. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Schultz S.C., Shields G.C., Steitz T.A. Crystal structure of a CAP-DNA complex: The DNA is bent by 90 degrees. Science. 1991;253:1001–1007. doi: 10.1126/science.1653449. [DOI] [PubMed] [Google Scholar]
- 33.Kissinger C.R., Liu B.S., Martin-Blanco E., Kornberg T.B., Pabo C.O. Crystal structure of an engrailed homeodomain-DNA complex at 2.8 A resolution: A framework for understanding homeodomain-DNA interactions. Cell. 1990;63:579–590. doi: 10.1016/0092-8674(90)90453-l. [DOI] [PubMed] [Google Scholar]
- 34.Wolberger C., Vershon A.K., Liu B., Johnson A.D., Pabo C.O. Crystal structure of a MAT alpha 2 homeodomain-operator complex suggests a general model for homeodomain-DNA interactions. Cell. 1991;67:517–528. doi: 10.1016/0092-8674(91)90526-5. [DOI] [PubMed] [Google Scholar]
- 35.Otwinowski Z., Schevitz R.W., Zhang R.G., Lawson C.L., Joachimiak A., Marmorstein R.Q., Luisi B.F., Sigler P.B. Crystal structure of trp repressor/operator complex at atomic resolution. Nature. 1988;335:321–329. doi: 10.1038/335321a0. [DOI] [PubMed] [Google Scholar]
- 36.Rohs R., West S.M., Sosinsky A., Liu P., Mann R.S., Honig B. The role of DNA shape in protein-DNA recognition. Nature. 2009;461:1248–1253. doi: 10.1038/nature08473. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Somers W.S., Phillips S.E. Crystal structure of the met repressor-operator complex at 2.8 A resolution reveals DNA recognition by beta-strands. Nature. 1992;359:387–393. doi: 10.1038/359387a0. [DOI] [PubMed] [Google Scholar]
- 38. Church G.M., Sussman J.L., Kim S.H. Secondary structural complementarity between DNA and proteins. Proc. Natl. Acad. Sci. U. S. A. 1977;74:1458–1462. doi: 10.1073/pnas.74.4.1458.
- 39. Ellenberger T.E., Brandl C.J., Struhl K., Harrison S.C. The GCN4 basic region leucine zipper binds DNA as a dimer of uninterrupted alpha helices: Crystal structure of the protein-DNA complex. Cell. 1992;71:1223–1237. doi: 10.1016/s0092-8674(05)80070-4.
- 40. Ferré-D'Amaré A.R., Prendergast G.C., Ziff E.B., Burley S.K. Recognition by max of its cognate DNA through a dimeric b/HLH/Z domain. Nature. 1993;363:38–45. doi: 10.1038/363038a0.
- 41. Marmorstein R., Carey M., Ptashne M., Harrison S.C. DNA recognition by GAL4: Structure of a protein-DNA complex. Nature. 1992;356:408–414. doi: 10.1038/356408a0.
- 42. Ghosh G., van Duyne G., Ghosh S., Sigler P.B. Structure of NF-kappa B p50 homodimer bound to a kappa B site. Nature. 1995;373:303–310. doi: 10.1038/373303a0.
- 43. Müller C.W., Rey F.A., Sodeoka M., Verdine G.L., Harrison S.C. Structure of the NF-kappa B p50 homodimer bound to DNA. Nature. 1995;373:311–317. doi: 10.1038/373311a0.
- 44. Sainsbury S., Bernecky C., Cramer P. Structural basis of transcription initiation by RNA polymerase II. Nat. Rev. Mol. Cell Biol. 2015;16:129–143. doi: 10.1038/nrm3952.
- 45. Kim J.L., Nikolov D.B., Burley S.K. Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature. 1993;365:520–527. doi: 10.1038/365520a0.
- 46. Kim Y., Geiger J.H., Hahn S., Sigler P.B. Crystal structure of a yeast TBP/TATA-box complex. Nature. 1993;365:512–520. doi: 10.1038/365512a0.
- 47. Pavletich N.P., Pabo C.O. Zinc finger-DNA recognition: Crystal structure of a zif268-DNA complex at 2.1 Å. Science. 1991;252:809–817. doi: 10.1126/science.2028256.
- 48. Miller J., McLachlan A.D., Klug A. Repetitive zinc-binding domains in the protein transcription factor IIIA from Xenopus oocytes. EMBO J. 1985;4:1609–1614. doi: 10.1002/j.1460-2075.1985.tb03825.x.
- 49. Liu Q., Segal D.J., Ghiara J.B., Barbas C.F., 3rd. Design of polydactyl zinc-finger proteins for unique addressing within complex genomes. Proc. Natl. Acad. Sci. U. S. A. 1997;94:5525–5530. doi: 10.1073/pnas.94.11.5525.
- 50. Desjarlais J.R., Berg J.M. Use of a zinc-finger consensus sequence framework and specificity rules to design specific DNA binding proteins. Proc. Natl. Acad. Sci. U. S. A. 1993;90:2256–2260. doi: 10.1073/pnas.90.6.2256.
- 51. Rebar E.J., Pabo C.O. Zinc finger phage: Affinity selection of fingers with new DNA-binding specificities. Science. 1994;263:671–673. doi: 10.1126/science.8303274.
- 52. Greisman H.A., Pabo C.O. A general strategy for selecting high-affinity zinc finger proteins for diverse DNA target sites. Science. 1997;275:657–661. doi: 10.1126/science.275.5300.657.
- 53. Kim Y.G., Cha J., Chandrasegaran S. Hybrid restriction enzymes: Zinc finger fusions to Fok I cleavage domain. Proc. Natl. Acad. Sci. U. S. A. 1996;93:1156–1160. doi: 10.1073/pnas.93.3.1156.
- 54. Porteus M.H., Carroll D. Gene targeting using zinc finger nucleases. Nat. Biotechnol. 2005;23:967–973. doi: 10.1038/nbt1125.
- 55. Christian M., Cermak T., Doyle E.L., Schmidt C., Zhang F., Hummel A., Bogdanove A.J., Voytas D.F. Targeting DNA double-strand breaks with TAL effector nucleases. Genetics. 2010;186:757–761. doi: 10.1534/genetics.110.120717.
- 56. Mak A.N., Bradley P., Cernadas R.A., Bogdanove A.J., Stoddard B.L. The crystal structure of TAL effector PthXo1 bound to its DNA target. Science. 2012;335:716–719. doi: 10.1126/science.1216211.
- 57. Boch J., Scholze H., Schornack S., Landgraf A., Hahn S., Kay S., Lahaye T., Nickstadt A., Bonas U. Breaking the code of DNA binding specificity of TAL-type III effectors. Science. 2009;326:1509–1512. doi: 10.1126/science.1178811.
- 58. Moscou M.J., Bogdanove A.J. A simple cipher governs DNA recognition by TAL effectors. Science. 2009;326:1501. doi: 10.1126/science.1178817.
- 59. Luscombe N.M., Austin S.E., Berman H.M., Thornton J.M. An overview of the structures of protein-DNA complexes. Genome Biol. 2000;1. doi: 10.1186/gb-2000-1-1-reviews001.
- 60. Matthews B.W. Protein-DNA interaction. No code for recognition. Nature. 1988;335:294–295. doi: 10.1038/335294a0.
- 61. Rohs R., Jin X., West S.M., Joshi R., Honig B., Mann R.S. Origins of specificity in protein-DNA recognition. Annu. Rev. Biochem. 2010;79:233–269. doi: 10.1146/annurev-biochem-060408-091030.
- 62. Rastinejad F., Perlmann T., Evans R.M., Sigler P.B. Structural determinants of nuclear receptor assembly on DNA direct repeats. Nature. 1995;375:203–211. doi: 10.1038/375203a0.
- 63. Bain D.L., Heneghan A.F., Connaghan-Jones K.D., Miura M.T. Nuclear receptor structure: Implications for function. Annu. Rev. Physiol. 2007;69:201–220. doi: 10.1146/annurev.physiol.69.031905.160308.
- 64. Luisi B.F., Xu W.X., Otwinowski Z., Freedman L.P., Yamamoto K.R., Sigler P.B. Crystallographic analysis of the interaction of the glucocorticoid receptor with DNA. Nature. 1991;352:497–505. doi: 10.1038/352497a0.
- 65. Li T., Stark M.R., Johnson A.D., Wolberger C. Crystal structure of the MATa1/MAT alpha 2 homeodomain heterodimer bound to DNA. Science. 1995;270:262–269. doi: 10.1126/science.270.5234.262.
- 66. Tan S., Richmond T.J. Crystal structure of the yeast MATalpha2/MCM1/DNA ternary complex. Nature. 1998;391:660–666. doi: 10.1038/35563.
- 67. Passner J.M., Ryoo H.D., Shen L., Mann R.S., Aggarwal A.K. Structure of a DNA-bound Ultrabithorax-Extradenticle homeodomain complex. Nature. 1999;397:714–719. doi: 10.1038/17833.
- 68. Piper D.E., Batchelor A.H., Chang C.P., Cleary M.L., Wolberger C. Structure of a HoxB1-Pbx1 heterodimer bound to DNA: Role of the hexapeptide and a fourth homeodomain helix in complex formation. Cell. 1999;96:587–597. doi: 10.1016/s0092-8674(00)80662-5.
- 69. Thanos D., Maniatis T. Virus induction of human IFN beta gene expression requires the assembly of an enhanceosome. Cell. 1995;83:1091–1100. doi: 10.1016/0092-8674(95)90136-1.
- 70. Panne D., Maniatis T., Harrison S.C. Crystal structure of ATF-2/c-Jun and IRF-3 bound to the interferon-beta enhancer. EMBO J. 2004;23:4384–4393. doi: 10.1038/sj.emboj.7600453.
- 71. Panne D., Maniatis T., Harrison S.C. An atomic model of the interferon-beta enhanceosome. Cell. 2007;129:1111–1123. doi: 10.1016/j.cell.2007.05.019.
- 72. Feng Z., Verdiguel N., Di Costanzo L., Goodsell D.S., Westbrook J.D., Burley S.K., Zardecki C. Impact of the Protein Data Bank across scientific disciplines. Data Sci. J. 2020;19:1–14.
- 73. Luger K., Mader A.W., Richmond R.K., Sargent D.F., Richmond T.J. Crystal structure of the nucleosome core particle at 2.8 Å resolution. Nature. 1997;389:251–260. doi: 10.1038/38444.
- 74. Andrews A.J., Luger K. Nucleosome structure(s) and stability: Variations on a theme. Annu. Rev. Biophys. 2011;40:99–117. doi: 10.1146/annurev-biophys-042910-155329.
- 75. Soufi A., Garcia M.F., Jaroszewicz A., Osman N., Pellegrini M., Zaret K.S. Pioneer transcription factors target partial DNA motifs on nucleosomes to initiate reprogramming. Cell. 2015;161:555–568. doi: 10.1016/j.cell.2015.03.017.
- 76. Dodonova S.O., Zhu F., Dienemann C., Taipale J., Cramer P. Nucleosome-bound SOX2 and SOX11 structures elucidate pioneer factor function. Nature. 2020;580:669–672. doi: 10.1038/s41586-020-2195-y.
- 77. Michael A.K., Grand R.S., Isbel L., Cavadini S., Kozicka Z., Kempf G., Bunker R.D., Schenk A.D., Graff-Meyer A., Pathare G.R., Weiss J., Matsumoto S., Burger L., Schübeler D., Thomä N.H. Mechanisms of OCT4-SOX2 motif readout on nucleosomes. Science. 2020;368:1460–1465. doi: 10.1126/science.abb0074.
- 78. Kuhlbrandt W. Biochemistry. The resolution revolution. Science. 2014;343:1443–1444. doi: 10.1126/science.1251652.
- 79. Tan S., Hunziker Y., Sargent D.F., Richmond T.J. Crystal structure of a yeast TFIIA/TBP/DNA complex. Nature. 1996;381:127–151. doi: 10.1038/381127a0.
- 80. Nikolov D.B., Chen H., Halay E.D., Usheva A.A., Hisatake K., Lee D.K., Roeder R.G., Burley S.K. Crystal structure of a TFIIB-TBP-TATA-element ternary complex. Nature. 1995;377:119–128. doi: 10.1038/377119a0.
- 81. Westbrook J.D., Bourne P.E. STAR/mmCIF: An ontology for macromolecular structure. Bioinformatics. 2000;16:159–168. doi: 10.1093/bioinformatics/16.2.159.
- 82. Webster M.W., Takacs M., Zhu C., Vidmar V., Eduljee A., Abdelkareem M., Weixlbaumer A. Structural basis of transcription-translation coupling and collision in bacteria. Science. 2020;369:1355–1359. doi: 10.1126/science.abb5036.
- 83. Wang C., Molodtsov V., Firlar E., Kaelber J.T., Blaha G., Su M., Ebright R.H. Structural basis of transcription-translation coupling. Science. 2020;369:1359–1365. doi: 10.1126/science.abb5317.
- 84. Chen X., Qi Y., Wu Z., Wang X., Li J., Zhao D., Hou H., Li Y., Yu Z., Liu W., Wang M., Ren Y., Li Z., Yang H., Xu Y. Structural insights into preinitiation complex assembly on core promoters. Science. 2021;372. doi: 10.1126/science.aba8490.
- 85. Rout M.P., Sali A. Principles for integrative structural biology studies. Cell. 2019;177:1384–1403. doi: 10.1016/j.cell.2019.05.016.
- 86. Sali A., Berman H.M., Schwede T., Trewhella J., Kleywegt G., Burley S.K., Markley J., Nakamura H., Adams P., Bonvin A.M., Chiu W., Peraro M.D., Di Maio F., Ferrin T.E., Grunewald K. Outcome of the first wwPDB hybrid/integrative methods task force workshop. Structure. 2015;23:1156–1167. doi: 10.1016/j.str.2015.05.013.
- 87. Turk M., Baumeister W. The promise and the challenges of cryo-electron tomography. FEBS Lett. 2020;594:3243–3261. doi: 10.1002/1873-3468.13948.