Abstract
A general strategy is described for designing proteins that self assemble into large symmetrical nanomaterials, including molecular cages, filaments, layers, and porous materials. In this strategy, one molecule of protein A, which naturally forms a self-assembling oligomer, An, is fused rigidly to one molecule of protein B, which forms another self-assembling oligomer, Bm. The result is a fusion protein, A-B, which self assembles with other identical copies of itself into a designed nanohedral particle or material, (A-B)p. The strategy is demonstrated through the design, production, and characterization of two fusion proteins: a 49-kDa protein designed to assemble into a cage approximately 15 nm across, and a 44-kDa protein designed to assemble into long filaments approximately 4 nm wide. The strategy opens a way to create a wide variety of potentially useful protein-based materials, some of which share similar features with natural biological assemblies.
A central goal of nanotechnology is to design and fabricate novel materials with sizes in the nanometer range (1). These materials fall into different architectural classes, such as compact clusters (2), hollow shells (3–5) and tubes (6–8), two-dimensional layers (9, 10), and nanoporous materials (11–13). In recent years, a wide variety of chemical building blocks and synthetic strategies have been investigated. Specific methods have produced interesting new materials, but a single general approach for fabricating materials having many different architectures and symmetries has not emerged. In the present work, we move a step closer to this fundamental goal with a method for engineering self-assembling nanomaterials by combining naturally symmetric protein components.
Symmetry provides a powerful tool for building large regular objects. Our general strategy for using symmetry to construct protein nanomaterials is illustrated in Fig. 1. Many natural proteins have evolved on their own to self associate through noncovalent interactions. Those that associate two at a time (dimers) or three at a time (trimers) are relatively common (Fig. 1a). These natural proteins make up our raw building materials. We refer to them here as oligomerization domains. Following specific geometric rules, two oligomerization domains are connected by genetic manipulation into a single larger molecule called a fusion protein. A fusion protein therefore carries two parts, each with a strong tendency to associate with other copies of itself. As a consequence of this design, many identical copies of the fusion protein self assemble into a highly symmetric object or extended material we call a protein nanohedron or a protein nanohedral material.
A rich variety of nanohedral structures can be designed that take forms such as cages and shells or unbounded layers, filaments, or three-dimensional crystals. We refer to these as separate architectural classes. Within each architectural class, several different symmetries may be possible. Which symmetries and architectures can be designed with this strategy, and what geometric rules of construction must be followed to achieve them? Those that can be built by fusing together two dimeric or trimeric oligomerization domains are listed in Table 1. A fusion protein carries two virtual symmetry axes, one from each oligomerization domain (Fig. 1b). To achieve a given symmetry and architecture, there is a construction rule (Table 1) that describes the fixed geometric relationship the two symmetry axes must have with each other. For example, for a nanohedral protein layer with hexagonal (p6) symmetry, the construction rule is that the symmetry axis from the dimeric oligomerization domain and that from the trimeric oligomerization domain must be nonintersecting and parallel (Fig. 1d).
Table 1.
Symmetry* | Construction | Angle between symmetry elements† | Intersecting (I) or nonintersecting (N)† |
---|---|---|---|
Cages and shells | | | |
T‡ | Dimer-trimer | 54.7° | I |
O§ | Dimer-trimer | 35.3° | I |
I | Dimer-trimer | 20.9° | I |
Double-layer rings | | | |
Dn | Dimer-dimer | 180°/n | I |
Two-dimensional layers | | | |
p6¶ | Dimer-trimer | 0° | N |
p321 | Dimer-trimer | 90° | N |
p3 | Trimer-trimer | 0° | N |
Three-dimensional crystals | | | |
I2₁3 | Dimer-trimer | 54.7° | N |
P4₁32 or P4₃32 | Dimer-trimer | 35.3° | N |
P23 | Trimer-trimer | 70.5° | N |
Helical filaments | | | |
Helical‖ | Dimer-dimer | Any angle | N |

This table gives only those symmetries that can be constructed by combining two oligomerization domains of the dimeric or trimeric type. Other kinds of oligomerization domains, such as tetramers, would give additional possibilities not listed here.
*T, O, I, and Dn refer to tetrahedral, octahedral, icosahedral, and dihedral symmetry, respectively. The remaining symbols are denoted by their Hermann–Mauguin symbols.
†The angle formed between the two symmetry elements is given, followed by I or N to denote intersecting or nonintersecting axes.
‡See Fig. 2c.
§See Fig. 1e.
¶See Fig. 1d.
‖See Fig. 3b.
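The angles listed for the cubic symmetries in Table 1 can be recovered from elementary geometry. The following is a minimal sketch, not part of the original work, which assumes a standard orientation in which twofold axes lie along coordinate or face-diagonal directions and threefold axes along body-diagonal directions; the axis vectors and the helper name `angle_deg` are illustrative choices.

```python
import math

def angle_deg(u, v):
    """Acute angle in degrees between two axes given as direction vectors."""
    # An axis has no sign, so the absolute value of the dot product is used.
    dot = abs(sum(a * b for a, b in zip(u, v)))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(min(1.0, dot / norm)))

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, used for the icosahedral axes

# (twofold axis, threefold axis) pairs in an assumed standard orientation
axes = {
    "T": ((0, 0, 1), (1, 1, 1)),
    "O": ((1, 1, 0), (1, 1, 1)),
    "I": ((0, 0, 1), (0, 1 / PHI, PHI)),
}
angles = {name: round(angle_deg(two, three), 1)
          for name, (two, three) in axes.items()}

# Two body-diagonal threefold axes, as in the trimer-trimer P23 row.
angles["P23"] = round(angle_deg((1, 1, 1), (1, 1, -1)), 1)

print(angles)
```

Under these assumptions the computed values agree with the 54.7°, 35.3°, 20.9°, and 70.5° entries in the table.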
To satisfy the construction rule for a particular design, the two oligomerization domains in a fusion protein must be held together in a relatively rigid fashion. If the oligomerization domains are instead fused in arbitrary or flexible ways, the fusion proteins might assemble in an irregular fashion to give a material whose structure and properties cannot be anticipated. The method we used in this work to enforce the correct disposition of the separate oligomerization domains was to choose them from the set of oligomeric proteins whose three-dimensional structures are known and which begin or end in α-helices. The two oligomerization domains may then be fused by genetically engineering a connecting segment of amino acids that strongly prefers the α-helical conformation. The continuous α-helix extending from one oligomerization domain into the other provides the rigidity and directionality required to produce a fusion protein composed of two predictably connected oligomerization domains (Fig. 1c) with symmetry elements conforming to prescribed geometric rules (Table 1). This strategy has been applied successfully to produce two new protein materials.
Experimental Procedures
A Designed Cage.
To design a symmetric cage, we identified those dimeric and trimeric protein structures in the Protein Data Bank (14) that begin or end in an α-helix. By using a computer program, the dimers and trimers were considered in all pairwise combinations as potential oligomerization domains. The two oligomerization domains were connected computationally in three dimensions by a short intervening α-helical linker with a variable number of amino acid residues (Fig. 1c). Each hypothetical fusion protein model was examined to see whether the component oligomerization domains were arranged in a way that nearly satisfied the construction rules for a cage (Table 1). After checking for steric clashes in fully assembled models, several promising designs were obtained. We chose one with tetrahedral symmetry, intended to self assemble from 12 copies of a 49-kDa fusion protein.
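The geometric test applied to each hypothetical fusion model can be illustrated as follows. This is a hedged sketch only: the original program is not described in detail, so the function names `axis_geometry` and `satisfies_rule`, the representation of each symmetry axis as a point plus a direction, and the tolerance values are all assumptions made for illustration.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def axis_geometry(p1, d1, p2, d2):
    """Return (acute angle in degrees, closest distance) between two lines.

    Each line is a symmetry axis given by a point p and a direction d.
    """
    cosang = min(1.0, abs(dot(d1, d2)) / (norm(d1) * norm(d2)))
    angle = math.degrees(math.acos(cosang))
    n = cross(d1, d2)
    w = sub(p2, p1)
    if norm(n) < 1e-9 * norm(d1) * norm(d2):
        # Parallel axes: distance from p2 to the first line.
        dist = norm(cross(w, d1)) / norm(d1)
    else:
        # Skew or intersecting axes: project the offset onto the common normal.
        dist = abs(dot(w, n)) / norm(n)
    return angle, dist

def satisfies_rule(angle, dist, target_angle, intersecting,
                   angle_tol=5.0, dist_tol=2.0):
    """Check a model against one construction rule (tolerances are hypothetical)."""
    if abs(angle - target_angle) > angle_tol:
        return False
    return dist <= dist_tol if intersecting else dist > dist_tol

# Example: a dimer axis along z and a trimer axis along the body diagonal,
# both through the origin, meet the tetrahedral rule (54.7 degrees, intersecting).
angle, dist = axis_geometry((0, 0, 0), (0, 0, 1), (0, 0, 0), (1, 1, 1))
print(round(angle, 1), round(dist, 3), satisfies_rule(angle, dist, 54.7, True))
```

In a search of this kind, such a check would be repeated for every dimer-trimer pair and every linker length, keeping only models that come close to a rule in Table 1 before testing for steric clashes.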
The oligomerization domains of the fusion protein were the trimeric bromoperoxidase (ref. 15; DNA kindly provided by Karl-Heinz van Pée, Technische Universität, Dresden, Germany) and the dimeric M1 matrix protein of influenza virus (ref. 16; DNA kindly provided by Ming Luo, University of Alabama, Birmingham) connected by a nine-residue helical linker. According to the design, the tetrahedral cage was expected to be approximately 15 nm on an edge and about 9 nm from the center to the perimeter.
PCR primers were designed to amplify a DNA fragment containing the bromoperoxidase gene [Protein Data Bank (PDB) ID code 1bro] and add to it the nine-residue linker KALEAQKQK [an amino acid sequence chosen from a portion of a long helix in ribosomal protein L9 (PDB ID code 1div)] and part of influenza virus matrix protein M1 (PDB ID code 1aa7). The final product consisted of residues 1–276 of bromoperoxidase, followed by the amino acid sequence KALEAQKQK, followed by residues 3–164 of influenza virus matrix protein M1. After ligation into a pET21b vector and bacterial transformation, the protein was produced in Escherichia coli BL21(DE3) cells growing at 37°C by inducing expression at an optical density at 600 nm of 0.6 with 1 mM isopropyl-thio-β-D-galactoside and harvesting 3 h later. Cells were lysed in a French press in buffer A [100 mM NaCl/10 mM Hepes/0.02% azide (pH 7.2)] and a mixture containing EDTA, PMSF, benzamidine, lysozyme, DNase I, and RNase A. After centrifugation, the soluble cell lysate was precipitated with ammonium sulfate. The fusion protein was abundant in the 35–40% ammonium sulfate fraction. The pellet was resuspended in buffer A, run over a Q-Sepharose anion exchange column, and eluted with a 0.1 M to 1 M NaCl gradient. After buffer exchange back to buffer A by using a centrifuge concentrator, the protein was run over a nickel chelating column and eluted with buffer A plus 500 mM imidazole. Although the protein does not have a polyhistidine tail, it bound to the column with useful affinity. The protein was further purified on a Sephacryl S200h size exclusion column (Amersham Pharmacia) in buffer A. Approximately 1 mg of pure protein was obtained per liter of cell culture.
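As a rough consistency check on the construct described above, the residue ranges can be summed and converted to an approximate mass. This sketch assumes an average residue mass of 110 Da, a common rule of thumb that is not taken from the original text; the variable and function names are illustrative.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumed average mass per amino acid residue

def segment_length(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1

LINKER = "KALEAQKQK"  # nine-residue helical linker given in the text

cage_residues = (
    segment_length(1, 276)      # bromoperoxidase residues 1-276
    + len(LINKER)               # linker
    + segment_length(3, 164)    # matrix protein M1 residues 3-164
)
cage_mass_kda = cage_residues * AVG_RESIDUE_MASS_DA / 1000.0

print(cage_residues, round(cage_mass_kda, 1))
```

With these assumptions the construct has 447 residues and an estimated mass near 49 kDa, in line with the 49-kDa figure quoted for the cage-forming fusion protein.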
For electron microscopy, carbon support films mounted on grids were made hydrophilic immediately before use by high-voltage, alternating current-glow discharge. Samples at 0.02 mg/ml were applied directly onto grids and allowed to adhere for 2 min. Grids were rinsed with three drops of distilled water and negatively stained with 1% (vol/vol) uranyl acetate for 1 min. Specimens were examined in a Hitachi (Tokyo) H-7000 electron microscope at an accelerating voltage of 75 kV.
For analysis by equilibrium sedimentation, a sample at 0.7 mg/ml in buffer A was spun at 4,000 rpm in a Beckman Optima XL-A-70 analytical ultracentrifuge. A curve was fitted to the data by using Beckman Origin-based software (Version 3.01) to determine the molecular mass.
Dynamic light scattering was measured from a filtered sample in Buffer A at 0.3 mg/ml by using a Protein Solutions, Inc. (Charlottesville, VA) Dyna Pro Molecular Sizing instrument and DYNAMICS 4.0 software.
A Designed Filament.
A protein filament was chosen as the second design target to demonstrate the construction of an unbounded nanomaterial. According to the construction rules (Table 1), the filament architecture can be built from a fusion protein with two dimeric oligomerization domains, connected in such a way that their symmetry axes do not intersect. By using the same computational strategy described above, we identified pairs that could be fused with satisfactory geometry from among a set of potential oligomerization domains (Fig. 3). The chosen design was a fusion of influenza virus matrix protein M1 (16) and carboxylesterase (ref. 17; DNA kindly provided by Ook Joon Yoo and Se Won Suh, Seoul National University, Seoul, Korea) connected by a five-residue α-helical linker (Fig. 3). This 44-kDa fusion protein was engineered and purified from E. coli.
The carboxylesterase gene (Protein Data Bank ID code 1auo) was amplified from a plasmid by using PCR, and the DNA was extended by using primers to include the five-residue linker KDTDS (an amino acid sequence chosen from a helix connecting two domains of calmodulin) and a portion of influenza virus matrix protein M1. This DNA was digested and ligated into the pET21b vector containing influenza virus matrix protein M1. The fusion protein contained residues 1–216 of carboxylesterase, followed by the amino acid sequence KDTDS, followed by residues 3–164 of influenza virus matrix protein M1. The resulting DNA construct was then digested and ligated into the pET15b vector to add an N-terminal histidine tag.
The protein was expressed in E. coli BL21(DE3) cells at 37°C, induced with 1 mM isopropyl-thio-β-D-galactoside, and harvested 3 h later. As expected, the solubility was very low, so the protein in inclusion bodies was solubilized in 8 M urea/100 mM Tris (pH 8.0), then purified. The unfolded protein was run over a nickel chelating column and eluted with the urea buffer plus 500 mM imidazole. The purified protein was refolded by dialyzing against a refolding buffer [100 mM Tris (pH 7.8)/1.3 M urea/100 mM glycine/0.1 mM GSSG (oxidized glutathione)/1 mM reduced glutathione] and then dialyzing against PBS. Electron microscopy was performed as before, on a suspension at approximately 1 mg/ml.
Results
Designed Cage.
After purification, the 49-kDa fusion protein, designed to form a tetrahedral cage from 12 subunits, remained soluble at concentrations up to 20 mg/ml. Various experimental methods were used to reveal the mode of assembly of the designed protein (Fig. 2). The most definitive results came from equilibrium sedimentation, which gave a shape-independent molecular mass of 550 kDa (Fig. 2b), corresponding very nearly to the expected 590 kDa for 12 subunits. Light scattering and electron microscopy (Fig. 2a) further supported the finding that the designed protein assembled essentially as intended. Discrete particles were seen in the negative-stain electron micrographs at sizes consistent with the model of the tetrahedral complex. The frictional coefficient was measured by dynamic light scattering. Although the frictional coefficient can be converted to a valid hydrodynamic radius only for a solid, roughly spherical particle, the observed values were consistent with the design. The frictional coefficient equated to a spherical particle with a radius between 6 and 7.5 nm, which is as close to the model value of 9 nm for the designed assembly as could be expected for a cage-like structure.
In all experiments, minor components could also be detected, some smaller and some larger than 12 subunits, reflecting association–dissociation equilibrium and possible flexibility or polymorphism in the assembled particles. Although the symmetry of the protein cage could not be verified by these experiments, the correct architectural class in the expected size range was demonstrated.
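The arithmetic behind the sedimentation comparison can be made explicit. The sketch below uses only the masses quoted in the text, plus one labeled assumption: a typical protein partial specific volume of 0.73 cm³/g, used to estimate the radius a fully compact sphere of the designed mass would have. The names and this compact-sphere comparison are illustrative and are not part of the original analysis.

```python
import math

SUBUNIT_KDA = 49.0               # fusion protein mass from the text
MEASURED_KDA = 550.0             # equilibrium sedimentation result from the text
DESIGNED_COPIES = 12             # subunits in the designed tetrahedral cage
PARTIAL_SPECIFIC_VOLUME = 0.73   # cm^3/g, assumed typical value for proteins
AVOGADRO = 6.02214076e23         # molecules per mole

expected_kda = DESIGNED_COPIES * SUBUNIT_KDA
apparent_copies = MEASURED_KDA / SUBUNIT_KDA

def compact_sphere_radius_nm(mass_kda, vbar=PARTIAL_SPECIFIC_VOLUME):
    """Radius of a solid sphere with the given mass and partial specific volume."""
    volume_cm3 = mass_kda * 1000.0 * vbar / AVOGADRO
    volume_nm3 = volume_cm3 * 1e21  # 1 cm^3 = 1e21 nm^3
    return (3.0 * volume_nm3 / (4.0 * math.pi)) ** (1.0 / 3.0)

print(expected_kda, round(apparent_copies, 1))
print(round(compact_sphere_radius_nm(expected_kda), 1))
```

Twelve 49-kDa subunits give 588 kDa, which the text rounds to 590 kDa, and the measured 550 kDa corresponds to roughly 11 subunits. Under the stated assumption, a solid sphere of the designed mass would have a radius of about 5.5 nm, so an apparent radius of 6 to 7.5 nm is at least compatible with a less compact, cage-like particle.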
Designed Protein Filament.
The 44-kDa fusion protein that had been designed to self assemble into filaments was expressed in E. coli and extracted from insoluble inclusion bodies. The purified insoluble material was sonicated, examined by electron microscopy, and found to contain essentially linear filaments, each approximately 4 nm wide. As is relatively common for natural protein filaments, these designed filaments are found arranged in networks or bundles under electron microscopy conditions (Fig. 3c). The width of the individual filaments and the visible details are consistent with the features of the designed filament.
Discussion and Conclusions.
A general strategy for designing many types of protein assemblies was tested in the laboratory on two designs, a cage and a filament. These successful experiments demonstrate two of the possible architectural classes and thereby validate the protein-design strategy. Now, an endless variety of nanohedral protein materials can be investigated. Useful applications may arise for many of the different architectural classes. Cage-like structures can be built to follow any of the cubic point symmetries (tetrahedral, octahedral, and icosahedral) assembling respectively from 12, 24, and 60 copies of a fusion protein (Fig. 1e). Depending on the specific shapes of the oligomerization domains, some cage structures may be relatively open, whereas others may be more compact. Hollow structures may be useful for delivering drugs or genes or for stabilizing, shielding, or sequestering other molecules in their interior volumes (18, 19). More compact structures might be useful for presenting multiple copies of viral or bacterial antigens or other chemical groups on their surfaces.
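The copy numbers quoted for the cubic point symmetries follow from the order of each rotation group, and the number of dimeric and trimeric contacts in a dimer-trimer cage follows by division. A minimal sketch, with illustrative names:

```python
# Orders of the rotational point groups T, O, and I.
ROTATION_GROUP_ORDER = {"T": 12, "O": 24, "I": 60}

def composition(symmetry):
    """Counts for a cage built from a dimer-trimer fusion protein."""
    copies = ROTATION_GROUP_ORDER[symmetry]
    return {
        "fusion_proteins": copies,
        "dimers": copies // 2,    # each dimeric domain pairs two copies
        "trimers": copies // 3,   # each trimeric domain joins three copies
    }

for name in ("T", "O", "I"):
    print(name, composition(name))
```

For the tetrahedral design tested here, this gives 12 fusion proteins forming 6 dimeric and 4 trimeric contacts.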
Although their construction has not been demonstrated yet, the design strategy also provides for ordered protein arrays that extend indefinitely in two or three dimensions (Fig. 1d). By using different approaches, some success has been reported already with nucleic acids (10) and with a carbohydrate-binding protein designed to assemble into an ordered array (20). Well ordered molecular layers may have applications as biosensors or detectors (21). Layers and solids with large pores might also be useful as molecular sieves. Porous materials have been fabricated from silicates and more recently from metal sulfides and metal phosphates, but it has been difficult to exceed a pore diameter of roughly 1.4 nm (11, 13). Two- and three-dimensional nanohedral protein materials could have precisely defined pore sizes in the 5-nm to 20-nm range.
Many recent efforts in protein engineering have focused on redesigning the internal structures of individual protein domains. Various studies have succeeded in increasing thermal stability (22, 23), incorporating binding sites for metals or ligands (24, 25), and controlling relatively simple oligomerization (26, 27). These tend to be challenging problems, mainly because success depends on a detailed understanding of protein energetics, binding, or catalysis.
The nanohedral protein-design strategy developed here differs not only in its goal to create large symmetric assemblies, but also in the nature of the protein redesign. Here, no attempt is made to alter the internal structures of individual protein domains. Instead, it is sufficient to fuse naturally occurring proteins without making any internal modifications. Another attractive feature of the method is the combinatorial power that comes from connecting multiple protein components. In fact, this combinatorial feature is critical to satisfy the specific geometric rules of construction (Table 1).
The creation of two protein materials with very different architectures shows the potential power of this design strategy. The experiments also identify some limitations and obstacles to be addressed in future work. The filament results confirm that bacterial expression and purification may be problematic for some unbounded nanohedral materials. The tetrahedral cage results show that flexibility or association–dissociation equilibria may lead to some heterogeneity in the nanostructures. Nonetheless, these first experiments point to a very promising avenue for creating a vast array of new biological materials.
Beyond their potential uses as novel materials, designed nanohedral structures could improve our understanding of many natural biological assemblies. The design and study of synthetic structures might shed light on the evolution of such diverse natural assemblies as viral capsids (28), clathrin cages (29), bacterial S-layers (21), and on the formation of protein fibers implicated in Alzheimer's disease and other amyloid pathologies (30).
Acknowledgments
We thank Mari Gingery for electron microscopy; Martin Phillips of the Department of Energy Instrumentation Facility, University of California, Los Angeles, for ultracentrifugation; Daniel Anderson for technical assistance; Florence Lee for expression and purification of the filament design protein; and David Eisenberg for helpful discussions. We are indebted to Ming Luo, Karl-Heinz van Pée, Ook Joon Yoo, and Se Won Suh for contributing the clones used to demonstrate the design strategy. This work was supported by the Department of Energy Grant DE-FC03-87ER60615 and the National Institutes of Health Grant GM31299.
References
- 1. Whitesides G M, Mathias J P, Seto C T. Science. 1991;254:1312–1319. doi: 10.1126/science.1962191.
- 2. Herron N, Thorn D L. Adv Mater. 1998;10:1173–1184.
- 3. Kroto H W, Heath J R, O'Brien S C, Curl R F, Smalley R E. Nature (London) 1985;318:162–163.
- 4. Zhang Y, Seeman N C. J Am Chem Soc. 1994;116:1661–1669.
- 5. Huang H Y, Remsen E E, Kowalewski T, Wooley K L. J Am Chem Soc. 1999;121:3805–3806.
- 6. Iijima S. Nature (London) 1991;354:56–58.
- 7. Ghadiri M R, Granja J R, Milligan R A, McRee D E, Khazanovich N. Nature (London) 1993;366:324–327. doi: 10.1038/366324a0.
- 8. Ajayan P M, Ebbesen T W. Rep Prog Phys. 1997;60:1025–1062.
- 9. Russell V A, Evans C C, Li W J, Ward M D. Science. 1997;276:575–579. doi: 10.1126/science.276.5312.575.
- 10. Winfree E, Liu F, Wenzler L A, Seeman N C. Nature (London) 1998;394:539–544. doi: 10.1038/28998.
- 11. Bu X, Feng P, Stucky G D. Science. 1997;278:2080–2085. doi: 10.1126/science.278.5346.2080.
- 12. Chui S S Y, Lo S M F, Charmant J P H, Orpen A G, Williams I D. Science. 1999;283:1148–1150. doi: 10.1126/science.283.5405.1148.
- 13. Li H L, Laine A, O'Keeffe M, Yaghi O M. Science. 1999;283:1145–1147. doi: 10.1126/science.283.5405.1145.
- 14. Abola E E, Sussman J L, Prilusky J, Manning N O. Methods Enzymol. 1997;277:556–571. doi: 10.1016/s0076-6879(97)77031-9.
- 15. Hecht H J, Sobek H, Haag T, Pfeifer O, van Pée K H. Nat Struct Biol. 1994;1:532–537. doi: 10.1038/nsb0894-532.
- 16. Sha B D, Luo M. Nat Struct Biol. 1997;4:239–244. doi: 10.1038/nsb0397-239.
- 17. Kim K K, Song H K, Shin D H, Hwang K Y, Choe S, Yoo O J, Suh S W. Structure (London) 1997;5:1571–1584. doi: 10.1016/s0969-2126(97)00306-7.
- 18. Rebek J. Chem Soc Rev. 1996;25:255–264.
- 19. Douglas T, Young M. Nature (London) 1998;393:152–155.
- 20. Dotan N, Arad D, Frolow F, Freeman A. Angew Chem Int Ed Engl. 1999;38:2363–2366.
- 21. Sára M, Sleytr U B. Micron. 1996;27:141–156. doi: 10.1016/0968-4328(96)80628-7.
- 22. Zhang X J, Baase W A, Shoichet B K, Wilson K P, Matthews B W. Protein Eng. 1995;8:1017–1022. doi: 10.1093/protein/8.10.1017.
- 23. Malakauskas S M, Mayo S L. Nat Struct Biol. 1998;5:470–475. doi: 10.1038/nsb0698-470.
- 24. Wilcox S K, Putnam C D, Sastry M, Blankenship J, Chazin W J, McRee D E, Goodin D B. Biochemistry. 1998;37:16853–16862. doi: 10.1021/bi9815039.
- 25. Wisz M S, Garrett C Z, Hellinga H W. Biochemistry. 1998;37:8269–8277. doi: 10.1021/bi980718f.
- 26. Ogihara N L, Weiss M S, Degrado W F, Eisenberg D. Protein Sci. 1997;6:80–88. doi: 10.1002/pro.5560060109.
- 27. Nautiyal S, Woolfson D N, King D S, Alber T. Biochemistry. 1995;34:11645–11651. doi: 10.1021/bi00037a001.
- 28. Caspar D L D, Klug A. Cold Spring Harbor Symp Quant Biol. 1962;27:1–24. doi: 10.1101/sqb.1962.027.001.005.
- 29. Smith C J, Pearse B M F. Trends Cell Biol. 1999;9:335–338. doi: 10.1016/s0962-8924(99)01631-1.
- 30. Kelly J W. Curr Opin Struct Biol. 1996;6:11–17. doi: 10.1016/s0959-440x(96)80089-3.