Abstract
The ability of proteins and other macromolecules to interact with inorganic surfaces is critical to biological function. The proteins involved in these interactions are highly charged and often rich in carboxylic acid side chains1-5, but the structures of most protein-inorganic interfaces are unknown. We explored the possibility of systematically designing structured protein-mineral interfaces guided by the example of ice-binding proteins, which present arrays of threonine residues matched to the ice lattice that order clathrate waters into an ice-like structure6. We designed proteins displaying arrays of up to 54 carboxylate residues geometrically matched to the K+ sublattice on muscovite mica (001). At low [K+] individual molecules bind independently to mica in the designed orientations, while at high [K+], the designs form 2D liquid-crystal phases, which accentuate the inherent structural bias in the muscovite lattice to produce protein arrays ordered over tens of millimeters. Incorporation of designed protein-protein interactions preserving the match between the proteins and the K+ lattice led to extended self-assembled structures on mica: designed end-to-end interactions produced micron long single protein-diameter wires, and a designed trimeric interface yielded extensive honeycomb arrays. The nearest neighbor distances in these hexagonal arrays could be set digitally between 7.5 and 15.9 nm with 2.1 nm selectivity by changing the number of repeat units in the monomer. These results demonstrate that protein-inorganic lattice interactions can be systematically programmed and set the stage for designing protein-inorganic hybrid materials.
Insight into protein-inorganic interfaces has come from studies of designed peptides that modulate calcite growth7, 8, additive control of crystallization9, designed helical peptides that assemble on carbon nanotubes10, and designed biphasic beta-sheet proteins11 and beta-sheet peptides12 on graphite surfaces. Designing assemblies of large proteins on inorganic lattices presents a new challenge as extensive spatial matching must exist within individual subunits and be maintained in the protein assembly. Collagen forms ordered arrays on mica13, but despite theoretical work14 the physical basis of the observed alignment remains unclear. The explicit programming by design of proteins to bind to inorganic lattices in pre-defined orientations, and assemble into larger scale arrays with different architectures, is a stringent test of our understanding of the principles of mineral-bound protein self-assembly.
Inspired by the lattice-matching of ice binding proteins to ice crystals, and the carboxyl rich nature of many proteins that interact with minerals, we explored the design of protein-inorganic interfaces based on the placement of carboxylate residues electrostatically and structurally matched to a crystalline inorganic surface. We chose muscovite mica as the model mineral system because it presents a well-defined K+ sublattice on the (001) cleavage plane (Fig. 1a, b, e, and Supplementary Fig. 1) onto which molecules and biomacromolecules have been shown to adsorb and, in some cases, assemble into ordered structures15-19. To achieve an extended geometrically matched binding interface, we sought a protein scaffold with a flat surface and a regularly repeating backbone with spacing equal to a multiple of the 5.2 Å nearest-neighbor distance between K+ sites. We found that the de novo designed helical repeat protein DHR10 satisfied these criteria20. To introduce a mica binding surface, one side of the protein was redesigned so that all residues were either glutamates lattice matched to mica K+ ions or alanines. We call this protein DHR10-micaX where X is the number of repeat subunits. The version with X=18, DHR10-mica18, was designed to have a large lattice-matched interface (1.8 nm by 18.7 nm) (Fig. 1a) and an exaggerated aspect ratio (3.6 nm wide, 20 nm long, and 2.5 nm high) to make the binding orientation of the protein readily detectable by atomic force microscopy (AFM). In Rosetta docking calculations, DHR10-mica18 exhibits a very strong preference for the designed binding orientation (Supplementary Fig. 2). DHR10-mica18 expressed in E. coli was alpha-helical (Supplementary Fig. 3), monomeric (Supplementary Figs. 4, 5a), had a small-angle X-ray scattering (SAXS) profile consistent with the design model (Supplementary Fig. 6)21,22, and showed directional adsorption in liquid AFM experiments on freshly cleaved mica in the presence of K+ consistent with the design scheme (Fig. 1c-e, Supplementary Figs. 7, 8).
The coverage and order of DHR10-mica18 on the surface increased with increasing salt concentration (Fig. 2a-c, Supplementary Figs. 9, 10a). In 10mM KCl DHR10-mica18 molecules adsorbed as stationary monomers (Supplementary Fig. 11a) with their repeat axes aligned to the three close-packed K+ directions (Figs. 1d, 2a, and Supplementary Figs. 7, 8). In 100mM KCl the proteins became highly mobile (Supplementary Fig. 11b) and assembled into a 2D liquid crystal-like phase with discrete co-aligned protein domains also aligned along the three close-packed K+ directions (Fig. 2b, Supplementary Fig. 12). In 3M KCl the protein mobility increased further (Supplementary Fig. 11c), and the arrays completely covered the surface with one lattice direction predominating over multi-millimeter distances (Fig. 2c, Supplementary Figs. 10b, 13 and Supplementary Table 1), corresponding to billions of co-aligned monomers. The long-range order likely accentuates a well-known structural anisotropy in the underlying muscovite crystal lattice23, adding to the natural entropy-driven tendency of nanorods to align24; on a truly threefold symmetric form of mica that lacks this anisotropy, domains of coaligned nanorods form with similar probabilities along the three K+ sublattice directions, even at 3M KCl (Fig. 2g). A redesign (DHR10-mica14-checker) which has the designed mica binding interface but considerably altered electrostatic patterning on its back face (Supplementary Fig. 14) forms ordered arrays on mica very similar to those of DHR10-mica18 (Fig. 2f, c, Supplementary Fig. 15 a-c), suggesting the designed interface, not the back face, directs mica binding.
To determine the length dependence of assembly, we took advantage of the modular nature of the repeat protein monomer and varied the number of repeat units. As expected for liquid crystal phases, the ability to order was strongly dependent on aspect ratio24. With 14 to 18 repeat units, high coverage and long-range order at high [KCl] were maintained (Fig. 2f, Supplementary Fig. 15e, f). With only 6 repeats, adsorbed protein was observed at 10mM KCl but not at 100mM or 3M KCl (Fig. 2d, e, and Supplementary Fig. 15d); in the latter cases the proteins likely transition to a 2D liquid phase, as expected for rod-shaped particles with low aspect ratios24, and move too rapidly to be observed as stationary objects by conventional AFM.
We measured the mica coverage at 10mM KCl for DHR10-mica10 vs protein concentration and DHR10-micaX for X = 4, 6, 10 and 18 (Fig. 2h). The amount of bound protein scales exponentially with X, implying that the free energy of binding scales linearly with the number of repeats; the average binding energy per repeat is 3.0 kJ/mol at 10mM KCl.
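The link between exponential coverage and a linear binding free energy can be illustrated with a Boltzmann-type estimate. The Python sketch below assumes that coverage in the dilute limit is proportional to exp(X·ε/RT), with ε the per-repeat binding energy, and assumes a temperature of 298 K. The temperature and the function name are illustrative assumptions, not values taken from the measurements, so the output is indicative only.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)

def coverage_fold_change(delta_repeats, energy_per_repeat_kj=3.0, temperature_k=298.0):
    """Fold change in dilute-limit coverage when delta_repeats repeats are added.

    Assumes coverage ~ exp(X * energy_per_repeat / RT), i.e. a binding free
    energy that is linear in the repeat number X. temperature_k is an
    assumed value, not one reported in the text.
    """
    rt = R_KJ_PER_MOL_K * temperature_k
    return math.exp(delta_repeats * energy_per_repeat_kj / rt)

per_repeat = coverage_fold_change(1)
four_to_eighteen = coverage_fold_change(18 - 4)
```

Under these assumptions each additional repeat raises the dilute-limit coverage a little more than threefold, which is why the dependence on X appears as a straight line on a logarithmic coverage axis.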
We next sought to direct the pattern of the self-assembling monomers on mica by designing interfaces stabilizing particular monomer-monomer arrangements. To maintain the overall alignment on mica, we restricted the designed interactions to those preserving, in the resulting assembly, the geometric match between the structure of each individual monomer and the mica lattice (Fig. 3a, b). We first explored the increase in effective nanorod length through design of end-to-end hydrophobic interactions between ‘non-capped’ monomers (Fig. 3a-c). Because the interactions are relatively weak, in solution the capless protein does not assemble into fibers (see Methods, Supplementary Fig. 4) and was monomeric as assayed by native mass spectrometry (Supplementary Fig. 16)25,26.
On mica, the DHR10-mica18-NC (Non-Capped) proteins form extensive single protein-diameter wires (Fig. 3d, g). This assembly was maintained for much shorter 6 and 2 repeat constructs (Fig. 3e, f, h, i). DHR10-mica6-NC formed micron length nanowires (Supplementary Fig. 17) while the capped version of DHR10-mica6 did not assemble (Fig. 2e), demonstrating the importance of the designed protein-protein interactions. The long-range order of the nanowire arrays increased with [KCl]: the wires were straighter and better aligned to the lattice, with the effect more pronounced for shorter monomers (Supplementary Fig. 18), likely due to the increased mobility and more favorable lateral interactions between the proteins due to electrostatic screening. Even at 3M KCl, some domains remain aligned to one of the other two lattice directions on muscovite mica (Fig. 3g, h). All three orientations were observed for threefold symmetric mica (Supplementary Fig. 19). The misalignment at low [KCl] and alternative domain orientations at high [KCl] likely reflect kinetic trapping27; the net affinity of the long fibers to the mica surface is greater than for individual DHR10-mica18 monomers.
Next, we explored designing protein-protein interfaces that direct formation of symmetric assemblies spatially matched with the hexagonal K+ sublattice. Interfaces with C2, C3 or C6 point symmetry are compatible with the surface symmetry; packing constraints preclude C6 arrangements of the monomers, and C2 arrangements would be difficult to distinguish from the head to tail arrangements of the monomers (as in Fig. 2 and 3), so we chose to design a C3 interface. In symmetric protein oligomer design, continuous sampling along angular and translational degrees of freedom is generally carried out to find a subunit arrangement that can accommodate a designed interface28, but in this case the orientations of lattice-matched proteins are constrained to 5.2Å*n translations (n is any integer) along the mica lattice and 120 and 240-degree rotations around lattice C3 symmetry axes. C3 lattice-matched docks were stabilized by a combination of backbone remodeling, sequence design (see Methods, Fig. 4c), and fusion to previous de novo designed trimeric helical bundles29. Six trimer designs were tested and found to form trimers to different extents on mica (Supplementary Fig. 20). DHR10-mica18-trimer-V1 had the largest fraction of correctly aligned trimers, but the majority of proteins still adsorbed as monomers.
To reduce the affinity between monomers and the mica lattice, we reduced the number of repeat units from 18 to 5, and to direct formation of a hexagonal lattice we incorporated a C2 interface at the exposed end of each trimer arm (Supplementary Fig. 21). At 3M KCl this design, DHR10-mica5-H (named H for Hexagonal), formed an extensive and regular hexagonal lattice on mica (Fig. 4k, p) more open than the original design model (Supplementary Fig. 21). Rosetta calculations (see Methods, Supplementary Fig. 22) identified a low energy lattice matched C2 arrangement of the DHR10-mica5-H trimers (Fig. 4a, b) that generates a honeycomb lattice very similar to that observed (Fig. 4f). At lower [KCl] the protein adsorbed but did not form the hexagonal network (Supplementary Fig. 23).
We sought to tune the spacing of the honeycomb lattice by taking advantage of the modular repeat architecture of DHR10. The individual repeat units each contribute 1.04 nm to the length of the monomer, and hence adding (or subtracting) a single repeat unit changes the hexagon side length by 2.08 nm. We generated a series of DHR10-micaX-H constructs with the N-terminal trimer and C-terminal dimer interfaces kept constant and the number of repeat units varied from 3 to 7. These proteins formed extensive and regular hexagonal lattices on mica up to several square microns with geometries consistent with the Rosetta models (compare Fig. 4d-h to Fig. 4i-m and Fig. 4n-r). All exhibited sharp 6-fold symmetric FFT patterns and the lattice parameters exhibited ratios of 3.0 to 4.0 to 4.8 to 6.1 to 6.9 for the 3, 4, 5, 6 and 7 repeat constructs respectively. The measured nearest-neighbor distances of the hexagonal arrays were within 5 to 15% of the corresponding Rosetta model values of 7.5, 9.6, 11.7, 13.8, and 15.9 nm, respectively (Supplementary Fig. 22). In the versions with 4 to 7 repeats, individual domains are orientationally aligned, even when not in contact, suggesting a shared lattice-match (Fig. 4j-m, Supplementary Fig. 24). The 3-repeat version forms domains in two different orientations (see Methods, Fig. 4i, Supplementary Fig. 25).
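The expected spacings follow from simple arithmetic on the repeat length. The Python sketch below takes the 5-repeat Rosetta model value (11.7 nm) as a reference and adds or subtracts 2 × 1.04 nm per repeat; the constant and function names are illustrative, and the choice of the 5-repeat design as the reference point is an assumption made for convenience.

```python
REPEAT_LENGTH_NM = 1.04       # length contributed by one repeat unit
REFERENCE_REPEATS = 5         # DHR10-mica5-H, used here as the reference design
REFERENCE_SPACING_NM = 11.7   # Rosetta model value for the 5-repeat design

def model_spacing_nm(repeats):
    """Nearest-neighbor distance expected for an X-repeat honeycomb.

    Each added repeat lengthens the hexagon side by 2 * 1.04 nm,
    as stated in the text.
    """
    return REFERENCE_SPACING_NM + 2 * REPEAT_LENGTH_NM * (repeats - REFERENCE_REPEATS)

spacings = {x: round(model_spacing_nm(x), 1) for x in range(3, 8)}
```

Rounded to one decimal place, this reproduces the model values of 7.5, 9.6, 11.7, 13.8, and 15.9 nm quoted for the 3 to 7 repeat constructs.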
To test the importance of specific protein-mineral interactions, beyond overall electrostatic attraction, to the observed assembly geometries, we varied both the properties of the surface and those of the protein. Oriented binding of DHR10-mica18 in 10mM K+ was not observed on highly ordered pyrolytic graphite (HOPG) or molybdenum disulfide, six-fold symmetric hydrophobic substrates previously used to adsorb beta-sheet proteins11,30, or on negatively charged plasma treated HOPG (Fig. 2i, Supplementary Fig. 26). To test the importance of the designed sidechain interactions, we generated four DHR10-mica4-H variants with the same overall charge on the mica interacting surface, but with different residue placement (Supplementary Fig. 27). Although all variants retain some residual complementarity to the mica lattice because the backbone spacing is fixed, and the protein-protein interfaces were left unchanged, three of the four constant surface charge variants failed to form the honeycomb lattice (Supplementary Fig. 28), demonstrating that the specific positioning of the charged protein side-chains is critical to patterned assembly.
Our results highlight the subtle balance of forces governing the structure of protein-mineral composite systems. From the perspective of colloidal physics, our designed proteins may be viewed as highly-charged, high-aspect nanorods packing in two dimensions on the charged mica surface. The forces determining structure are primarily entropic packing effects, modulated by electrostatics, that drive ionic strength-dependent liquid crystal formation. From the atomic-scale, biochemical perspective, the nanorods are patterned to be lattice matched to the surface and the interactions between the rods are tuned by modulating the configuration of atoms on their surfaces by protein design. We are able to control some aspects of the assemblies by balancing these two sets of contributions. Monomers that are too short, as in the DHR10-mica2-NC and DHR10-mica6-NC nanowires (Fig. 3e, f) and the DHR10-mica3-H hexagonal lattice (Fig. 4i), form assemblies with lower orientational order than longer monomers because the penalty for misalignment is too small. When the interactions are too strong, as in the end-to-end assembling nanorods in Fig. 3, order is reduced because of kinetic trapping. Increasing the KCl concentration weakens protein-mica interactions, increasing mobility and lowering kinetic barriers to annealing into the lowest energy state.
Because of the control afforded by computational protein design, and the use of rigid, designed building blocks, the honeycomb lattices with tunable unit cell parameters in Fig. 4 may be the best structurally characterized peptide/protein-inorganic lattice hybrids created to date. These lattices enable the patterning on mica of nano-wells with precisely controlled sizes that may be useful for diagnostics, high throughput biochemistry, and other applications. More generally, our results should inform the design of new protein-mineral hybrid materials.
Methods
Designing and modeling a mineral-matched repeat protein.
Our strategy to design protein-mineral interfaces is to match functional groups on protein sidechains to corresponding groups on the mineral lattice. To do this, we attempt to satisfy three conditions. First, the protein should have a repeating structure with a distance between the repeating units that is an integer multiple of a lattice parameter of the mineral lattice. Second, multiple sidechains in one repeat unit should make favorable contacts with the mineral lattice. Third, the protein should be nearly flat, so that it can interact with a crystal face over an extended area.
Our method for fulfilling these conditions for a particular mineral lattice is as follows. First, we identify a mineral surface and a designed repeat protein with compatible repeat spacings. Second, we design or identify sidechains with functional groups capable of interacting with the lattice—at least 2-3 within each repeat unit—that interact with sites on the mineral surface. Third, we relax the protein with constraints to perfectly match the target lattice, and predict the lowest energy protein-mineral interface.
Mica (001) potassium sublattice.
The (001) cleavage plane of mica presents a well-defined hexagonal K+ sublattice with a 5.2 Å lattice parameter (see figure below). The positive K+ layer was targeted instead of the negatively charged mica (001) surface so that [K+] could be used to tune binding affinity and so that charge-complementary designs would avoid the solubility problems associated with proteins with excessively positive surfaces31.
Selecting DHR10 starting scaffold.
Design models of designed helical repeat proteins20 were examined to determine their capacity to geometrically match the mica lattice. The designed helical repeat protein DHR10 was selected because it has a distance between repeats of approximately 10.4 Å (twice the 5.2 Å mica lattice parameter) and is very flat (lacking both curvature and twist). We hypothesized that these features would let DHR10 geometrically match the mica (001) surface in the way antifreeze repeat proteins match ice6.
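The spacing criterion used to pick the scaffold can be expressed as a simple commensurability check. The Python sketch below is an illustration only: the function name and the fractional-mismatch measure are assumptions, not part of the published protocols, and flatness (the other selection criterion) is not assessed.

```python
def lattice_mismatch(repeat_spacing_A, lattice_parameter_A=5.2):
    """Return (nearest integer multiple, fractional mismatch) for a repeat spacing.

    A mismatch of 0 means the repeat spacing is an exact integer multiple
    of the lattice parameter.
    """
    multiple = max(1, round(repeat_spacing_A / lattice_parameter_A))
    target = multiple * lattice_parameter_A
    return multiple, abs(repeat_spacing_A - target) / target

# DHR10: repeat spacing of approximately 10.4 Å against the 5.2 Å K+ sublattice
multiple, mismatch = lattice_mismatch(10.4)
```

For DHR10 the nearest multiple is 2 and the mismatch vanishes, consistent with the scaffold having been chosen for a repeat spacing of twice the lattice parameter.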
Code availability.
The Rosetta Macromolecular Modeling suite is available for non-commercial use at (https://www.rosettacommons.org). The specific Rosetta applications used were Rosetta Scripts32, Remodel33, and Pyrosetta34. Foldit35, a graphic user interface to Rosetta, was used as well. PyMOL36 was used to view the design models and prepare input files for Rosetta protocols.
A GitHub repository (https://github.com/pylesharley/DHR10micaX) contains the Rosetta Script protocols, input files, and Python scripts used to model the protein assemblies on mica, and corresponding README.txt files with instructions. In the following description, specific protocols are referred to by number+letter combinations that correspond to the directories in the DHR10micaX repository that contain them.
Representation of mica (001) K+ sublattice.
The mica (001) K+ sublattice was represented as 2269 K+ ions in a hexagonal grid with a 5.2 Å nearest-neighbor distance. The default Rosetta parameterization of K+ was used (database/chemical/residue_type_sets/fa_standard/residue_types/metal_ions/K.params). This representation of K+ has a +1 electrostatic charge and Lennard-Jones properties from CHARMM2737. The coordinates of the K+ ions were fixed in place during modeling by preventing sampling across the jumps between them in Rosetta’s fold-tree representation of connectivity.
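A grid of this kind is straightforward to generate. The Python sketch below builds a hexagon-shaped patch of a triangular lattice with 5.2 Å spacing. The shape of the patch is an assumption: only the ion count and spacing are stated above, and a patch of 27 rings around a central site is used here because it contains 3·27·28 + 1 = 2269 sites. The function name is illustrative and the sketch does not write Rosetta input files.

```python
import math

def hexagonal_patch(n_rings, spacing=5.2):
    """Return (x, y) site coordinates in Å for a hexagon-shaped lattice patch.

    Sites are indexed by integers (i, j) along two lattice vectors 60° apart;
    a site is kept when its hexagonal ring index max(|i|, |j|, |i + j|)
    does not exceed n_rings.
    """
    a1 = (spacing, 0.0)
    a2 = (spacing / 2, spacing * math.sqrt(3) / 2)
    sites = []
    for i in range(-n_rings, n_rings + 1):
        for j in range(-n_rings, n_rings + 1):
            if abs(i + j) <= n_rings:
                sites.append((i * a1[0] + j * a2[0], i * a1[1] + j * a2[1]))
    return sites

k_sites = hexagonal_patch(27)  # assumed patch size; yields 2269 sites
```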
Designing DHR10-mica interface.
DHR10 was observed to contain 12 glutamates (residues 28, 35, 42, 78, 85, 92, 128, 135, 142, 178, 185, 192) whose C-alpha coordinates form a triangular grid with spacing of about 10.4 Å. To allow these glutamate residues to reach the mica (001) K+ sublattice, the surrounding residues were mutated to alanine with Rosetta Script protocol 1. We call the version of DHR10 with these mutations DHR10-mica4, where 4 refers to the 4 fifty-residue repeat units it contains. The amino-acid sequence of DHR10-mica4 consists of a unique N-terminal repeat, two identical internal repeats, and a unique C-terminal repeat. To cap the structure and improve solubility, the first and last repeats have additional hydrophilic residues at positions that are solvent accessible only because they are at the ends of the protein. Protein versions with different numbers of repeats are called DHR10-micaX, where X is the number of repeat units, and can be generated by changing the number of identical internal repeats.
Modeling lattice-matched repeat protein (DHR10-mica18).
To model a lattice-matched interface, we decided to use the eighteen-repeat DHR10-mica18 protein for three reasons. First, we predicted that a more extensive interface would be more likely to bind along the lattice. Second, we wanted a protein with an exaggerated aspect ratio so that its binding orientation could be observed with AFM. Third, cloning a DNA construct encoding an eighteen-repeat version was relatively convenient (see cloning section below).
The DHR10-mica18 sequence was generated by expanding the number of internal repeats from 2 to 16. Backbone torsion angles, (phi, psi, and omega) from an internal repeat of the DHR10-mica4 model were copied and duplicated 18 times with Pyrosetta to create an eighteen-repeat backbone model (protocol 2a). The DHR10 model had minor structural differences between repeats and the backbone extrapolated from one repeat was not as flat as the starting model. Rosetta Scripts protocol 2b used monte-carlo sampling to find a set of 50-backbone torsion angles that, when repeated in tandem 18 times, produced a DHR10-mica18 model that is perfectly flat and has a repeat spacing of 10.4 Å (double the 5.2 Å spacing on mica). Three sets of constraints were used during sampling. First, alpha carbon (Cα) coordinate constraints based on residues 2 to 74 in the DHR10 crystal structure (pdb id: 5CWG). Second, 10.4 Å spaced Cα coordinate constraints that were extrapolated from residues 29, 36, and 43 in the crystal structure for lattice matching residues in DHR10-mica18 (28, 35, 42, 78, 85, 92, 128, 135, 142, 178, 185, 192, 228, 235, 242, 278, 285, 292, 328, 335, 342, 378, 385, 392, 428, 435, 442, 478, 485, 492, 528, 535, 542, 578, 585, 592, 628, 635, 642, 678, 685, 692, 728, 735, 742, 778, 785, 792, 828, 835, 842, 878, 885, 892). Third, 10.4 Å distance constraints between alpha-carbon atoms in adjacent repeats.
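The list of lattice-matching residues follows a regular pattern: three positions per repeat, shifted by 50 residues from one repeat to the next. The Python sketch below regenerates the list from that pattern. The names are illustrative and the sketch does not reproduce protocol 2b; it only shows how the constraint positions can be enumerated.

```python
REPEAT_LENGTH = 50                  # residues per repeat unit
FIRST_REPEAT_SITES = (28, 35, 42)   # lattice-matching positions in the first repeat

def lattice_matched_residues(n_repeats):
    """Residue numbers of the lattice-matching positions in an n-repeat protein."""
    return [site + REPEAT_LENGTH * k
            for k in range(n_repeats)
            for site in FIRST_REPEAT_SITES]

sites_4 = lattice_matched_residues(4)    # DHR10-mica4
sites_18 = lattice_matched_residues(18)  # DHR10-mica18
```

The 18-repeat list contains 54 positions, running from 28 to 892, which matches the residue numbers given above.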
Modeling lattice-matching binding mode (DHR10-mica18).
DHR10-mica18 was docked onto a model of the potassium ion sublattice consisting of 2269 ion coordinates arranged in a hexagonal grid with 5.2 Å spacing. Rosetta Scripts docking protocol 3a was used. In each docking trajectory, the protein was randomly spun about the z-axis in a 60° window, translated a random distance (between 0 and 5.2 Å), and then minimized with a relax-protocol. During the relax step, side-chains of residues in and around the surface-matching interface were allowed to repack and the rigid-body orientation of the protein on the sublattice was minimized. Throughout the docking protocol the protein backbone conformation and the structure of the potassium ions sublattice were fixed. The docked models with the lowest R.E.U. (Rosetta Energy Units) scores were lattice-matched, meaning the protein’s repeat direction was aligned to the 5.2 Å lattice direction and identical interactions were seen between repeated sites in the protein and the sublattice.
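The random starting perturbation in each trajectory can be sketched as follows. This Python example covers only the initial spin and slide, not the Rosetta relax step. The sliding direction (along x), the placement of the 60° window (0° to 60°), the rotation about the origin, and all names are assumptions made for illustration, since these details are not specified above.

```python
import math
import random

def perturb(coords, rng, window_deg=60.0, max_shift=5.2):
    """Spin (x, y, z) points about the z-axis, then slide them along x.

    Returns the moved coordinates together with the sampled angle (degrees)
    and shift (Å). The z coordinate is left unchanged.
    """
    angle_deg = rng.uniform(0.0, window_deg)
    shift = rng.uniform(0.0, max_shift)
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    moved = [(c * x - s * y + shift, s * x + c * y, z) for x, y, z in coords]
    return moved, angle_deg, shift

rng = random.Random(0)  # fixed seed so the sketch is repeatable
start = [(0.0, 0.0, 3.0), (10.4, 0.0, 3.0)]  # two placeholder points 10.4 Å apart
moved, angle_deg, shift = perturb(start, rng)
```

Because the K+ grid repeats every 5.2 Å and every 60°, sampling within one such window is sufficient to cover the distinct starting registers of the protein on the sublattice.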
Designing non-capped DHR10-micaX fibers.
For solubility, the original DHR monomers have polar residues at the exposed ends. To enable the same set of repeat unit-repeat unit interactions between monomers as within them, we replaced the polar, ‘capping’ residues with the non-polar residues at corresponding positions in the internal repeat units, producing a protein-protein interface that resembles the hydrophobic packing between repeat units (Fig. 3c). Because the repeat spacing is the same within and between monomers, the geometric match to the mica lattice is preserved across the monomer-monomer interface. To make the DHR10-mica18-NC model, the sequences of the first and last repeats in DHR10-mica18 were changed to the sequence of the internal repeat with protocol 4. Models of DHR10-mica2-NC and DHR10-mica6-NC fibers were made by splitting 100 and 300 residue segments (respectively) of the DHR10-mica18-NC model into separate chains, repacking all sidechains, relaxing the backbone of 3 residues flanking the cut-point, and minimizing the rigid-body orientation of each protein chain on the K+ sublattice (protocols 5a and 5b).
Designing symmetric interfaces between lattice-matched proteins.
When designing symmetric oligomers with proteins lattice-matched to a symmetric substrate, the symmetry between the proteins must be compatible with the symmetry of the surface. This severely restricts oligomer sampling in two ways. First, rotations are limited to spins around symmetry axes on the substrate. Second, translations are restricted to lattice-matched slides along the substrate. These sampling restrictions present a challenge as protein oligomer design typically depends on fine sampling of rigid-body degrees of freedom.
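The restricted search space can be made concrete by enumerating the allowed rigid-body moves. The Python sketch below lists rotations of 0°, 120°, and 240° combined with integer steps along two lattice vectors, and checks that each move maps a lattice site onto another lattice site. It places the rotation axis on a lattice site at the origin, which is only one of the possible C3 axes, and all names are illustrative rather than taken from the protocols.

```python
import math

A = 5.2                                   # K+ nearest-neighbor distance in Å
A1 = (A, 0.0)                             # first lattice vector
A2 = (A / 2, A * math.sqrt(3) / 2)        # second lattice vector, 60° from A1

def lattice_moves(max_steps):
    """Enumerate (rotation_deg, (dx, dy)) moves that preserve the lattice match."""
    moves = []
    for rot in (0, 120, 240):
        for n in range(-max_steps, max_steps + 1):
            for m in range(-max_steps, max_steps + 1):
                moves.append((rot, (n * A1[0] + m * A2[0], n * A1[1] + m * A2[1])))
    return moves

def apply_move(move, point):
    """Rotate a point about the origin, then translate it."""
    rot_deg, (dx, dy) = move
    t = math.radians(rot_deg)
    x, y = point
    return (math.cos(t) * x - math.sin(t) * y + dx,
            math.sin(t) * x + math.cos(t) * y + dy)

def on_lattice(point, tol=1e-6):
    """True when the point has integer coordinates in the (A1, A2) basis."""
    x, y = point
    j = y / A2[1]
    i = (x - j * A2[0]) / A
    return abs(i - round(i)) < tol and abs(j - round(j)) < tol

moves = lattice_moves(2)
```

With two steps allowed in each direction there are only 75 candidate moves, which illustrates how coarse the sampling is compared with the continuous searches normally used in oligomer design.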
To overcome this, the monomer structure can be redesigned to fit a specific oligomeric context. Reductive modifications (trimming a component to fit) are simpler than additive ones (e.g. designing a new domain). Removing residues from the protein termini is easy, while removing residues from the middle of the polypeptide chain is harder because a new backbone connection must be formed.
Designing DHR10-micaX trimer interface.
Designing the DHR10-micaX trimer involved generating trimer conformations that preserved each component’s register to the mica (001) K+ sublattice, removing residues from the N-terminus, and making amino-acid substitutions to form a protein-protein interface. The prominent N-terminal helix was trimmed of 1-12 residues to increase the number of non-clashing configurations (protocol 6a). Three distinguishable C3 axes that are compatible with the surface symmetry were identified (Supplementary Fig. 29). The trimmed N-termini of DHR10-micaX were translated across a hexagonal grid with 5.2 Å intervals along two lattice directions and then symmetrized around the three C3 axes.
The trimer conformations were generated with Rosetta Script protocol 6b, which also replaces bulky residues with alanine and scores the energy and DDG (delta-delta G; the energy difference between bound and unbound state) of potential docks. Positive DDGs reflect clashes and DDG=0 indicates the subunits are not in proximity, so only trimer configurations with negative DDGs were considered further. There were 104 potential trimers meeting these criteria, and one of these was selected for interface design based on visual examination in pymol. The selected dock was chosen based on the lack of buried polar backbone atoms, minimal perturbation of the termini (only 2 residues removed), and the presence of multiple alpha-helices close enough to form hydrophobic packing interactions.
The trimer interface was designed by combining amino-acid substitutions from Rosetta Scripts design protocols with human-defined substitutions designed with PyMOL and Foldit. After deciding on a sequence, the final model of the interface was made with Rosetta Scripts protocol 6c. Design rationales for the seventeen trimer-interface substitutions included in the final sequence are as follows. The core of the interface consists of Phe18 and Phe24 from one chain packing into Val2, Phe35, Val39, Leu53 of a symmetric copy (K2V, K18F, T24F, R39V). Trp3 was included to sterically block alternative close-packing interfaces (i.e. implicit negative design) and to form an interfacial cation-pi interaction with Arg11 (E3W, K11R). Five alanine substitutions were made to sterically accommodate other interface substitutions or other symmetric units in the trimer (E7A, V14A, E25A, E29A, E36A). Three residues surrounding a triangular cavity in the trimer interface were mutated from arginine to other charged residues to form intermolecular salt-bridges (R6K, R28E, R32K). Two substitutions were made so the sequence of the first repeat better matches the internal repeats (I12E, E43K). Ser1 is the end of a Gly-Ser linker (GSGGS) connecting the protein to an N-terminal thrombin-cleavable His6-tag (E1S).
After repacking the designed trimer interface symmetrically, a five-repeat version of the trimer was placed on the K+ sublattice and relaxed with Rosetta Script protocol 6d that repacks rotamers and minimizes the rigid-body dock of each protein subunit. This is to confirm that the subunits remain in the designed conformation and in register with the surface when the subunits are allowed to move freely on the sublattice.
Designing DHR10-mica5-H tiles (inaccurate model).
After observing the behavior of DHR10-mica18-trimer V1 on mica (Supplementary Fig. 20) we noted that among the adsorbed proteins some trimers were observed, but monomers predominated. We hypothesized that this was because the monomers could pack closer together and cover more of the surface, so we set out to design a symmetric layer with better surface coverage. A version with fewer repeats, DHR10-mica5-trimer, was used because it tessellates better than the longer armed versions. A DHR10-mica5-trimer model was symmetrized around four sublayer-compatible C2 symmetry axes (Supplementary Fig. 29) and translated along the lattice to sample dimer configurations that, when combined with the existing trimer interface, form a 2D-layer with P6 symmetry that is compatible with the substrate lattice.
When choosing this C2 interface we prioritized full coverage of the surface, not a designable protein-protein interface as we did for the C3 interface. Dimer-of-trimers configurations were screened with protocol 7a. One of these was very closely packed, but residues 216-222 in the last repeat clashed with the first and second repeat. To excise this region and allow the dense arrangement the helices in the C-terminal repeat were shortened and a new loop connecting them was designed.
The loop was prototyped in Foldit35 before a final model was built with Remodel and relaxed with hydrogen bond constraints in Rosetta Scripts. In Foldit the preexisting loop was cut and the endpoints were trimmed and reconnected using the delete and wiggle tools with cut-point distance constraints. Loop residues were mutated to flexible residues (Gly214, Gly215, Ser216) and a helix-breaking proline was added (Pro217). A blueprint file describing the loop’s connectivity, secondary structure, and sequence was written for Rosetta Remodel. Remodel followed this blueprint to replace a 24-residue segment with a 17-residue segment via fragment insertion (protocol 7b). Other amino-acid substitutions were made at this time to prevent clashes in the context of the close-packed arrangement. Subsequently, Rosetta Script protocol 7c was used to relax the region with sidechain-backbone hydrogen bond constraints. We call this protein, with the N-terminal trimer interface and modified C-terminal repeat, DHR10-mica5-H (H for hexagonal symmetry).
Two DHR10-mica5-H trimers docked in the close-packed C2 arrangement were used to generate a P6 symmetry definition file (protocols 7d and 7e). Rosetta Script protocol 7f symmetrically repacked rotamers and minimized the rigid-body orientation of subunits with constraints to prevent large movements to make a model of this ‘tiling’ layer.
Modeling DHR10-mica5-H honeycomb based on observed layer.
After AFM imaging of DHR10-mica5-H we realized that the tiling model was not accurate; rather, a honeycomb-like layer was formed. It seemed that the modified C-termini formed a C2 interface, pushing the C3 axes further apart than they were in the tiling model.
To model the C2 interface in the observed layers we resampled lattice-compatible dimer configurations between DHR10-mica5-H, this time with the C-terminal helices packed end-to-end (protocol 8a). These conformations were named by the translation applied prior to applying C2 symmetry (from a starting conformation designated 0.0X_0.0Y). Full-atom and polyalanine backbone scans together found nineteen lattice-matching dimers with a non-zero DDG below 1000 Rosetta Energy Units. These candidate dimer interfaces were evaluated by repacking the full sequence (which had already been designed in the context of the tiling layer) while allowing small (< 0.5 Å) movements across the C2 interface (protocol 8b).
The candidate C2 interface with the lowest full-atom DDG (9.10X_6.76Y; DDG = −21.2 REU) resembled the observed layers (Fig. 4 b, f, k, p) and has a nearest-neighbor distance of 11.7 nm between C3 axes (Supplementary Fig. 18). These interfaces form a P6 symmetric layer that we call the honeycomb model. Protocol 8d was used to repack and rigid-body minimize this layer with P6 symmetry. Six chains from the P6 layer were then relaxed on the K+ sublattice with protocol 8e to ensure each subunit preserved its lattice-match. Two other candidate interfaces had similar DDGs and also resembled the layers. Honeycomb models were made with these C2 interfaces as well: 10.40X_4.50Y (DDG = −18.5 REU, Nearest-neighbor distance=11.4 nm) and 11.70X_2.25Y (DDG = −19.3 REU, Nearest-neighbor distance=11.2 nm).
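The naming and ranking of candidate interfaces described above can be illustrated with a short sketch. It only re-tabulates the three DDG values and nearest-neighbor distances quoted in the text; the helper name `parse_translation` and the dictionary layout are assumptions made for illustration and are not part of the authors' protocols.

```python
import re

# DDG (Rosetta Energy Units) and C3-to-C3 nearest-neighbor distance (nm)
# for the three candidate C2 interfaces quoted in the text.
candidates = {
    "9.10X_6.76Y": {"ddg_reu": -21.2, "nn_nm": 11.7},
    "10.40X_4.50Y": {"ddg_reu": -18.5, "nn_nm": 11.4},
    "11.70X_2.25Y": {"ddg_reu": -19.3, "nn_nm": 11.2},
}


def parse_translation(name):
    """Split a name such as '9.10X_6.76Y' into its (x, y) translation values.

    Hypothetical helper: it assumes the '<x>X_<y>Y' naming pattern seen in the text.
    """
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)X_(-?\d+(?:\.\d+)?)Y", name)
    if match is None:
        raise ValueError(f"unrecognized interface name: {name}")
    return float(match.group(1)), float(match.group(2))


# Rank candidates from most to least favorable (lowest DDG first).
ranked = sorted(candidates, key=lambda name: candidates[name]["ddg_reu"])

for name in ranked:
    x, y = parse_translation(name)
    entry = candidates[name]
    print(f"{name}: x={x:.2f}, y={y:.2f}, "
          f"DDG={entry['ddg_reu']:.1f} REU, d={entry['nn_nm']:.1f} nm")
```

Sorting by DDG places 9.10X_6.76Y first, which is the interface used for the honeycomb model.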
Modulating DHR10-micaX-H honeycomb pore size.
The DHR10-mica5-H monomer model was modified into three-, four-, six-, and seven-repeat versions by deleting and inserting repeats in a list of its backbone torsion angles and its amino-acid sequence. As each repeat is 1.04 nm long and two copies of the protein separate the trimer nodes, the nearest-neighbor distance increases by 2.08 nm per repeat added and decreases by 2.08 nm per repeat removed. PyRosetta protocol 9a generated backbone models from the torsion angles and sequence extrapolated from DHR10-mica5-H. The new DHR10-micaX-H monomers were superimposed with structurally analogous regions in the DHR10-mica5-H honeycomb model and slid with 10.4 Å translations to accommodate the varying number of repeats (protocol 9b). The resulting trimer-of-dimer structures were used to generate symmetry definition files for a Rosetta Scripts protocol to repack and minimize the dock in P6 symmetry with constraints to prevent large movements (protocols 9c & 9d). Finally, hexamers extracted from the low-energy P6 layers were relaxed with the K+ sublattice without constraints with protocol 9e.
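The dependence of pore spacing on repeat number can be checked with a few lines of arithmetic. Because two monomers span each edge between trimer nodes, each repeat added to the monomer lengthens the edge by two repeat lengths (2 × 1.04 nm); starting from the 11.7 nm honeycomb model of the five-repeat design, this reproduces the 7.5 to 15.9 nm range stated in the abstract. The function name below is an assumption for illustration.

```python
REPEAT_LENGTH_NM = 1.04        # length of one repeat, from the text
REFERENCE_REPEATS = 5          # DHR10-mica5-H
REFERENCE_DISTANCE_NM = 11.7   # nearest-neighbor distance of the honeycomb model


def nearest_neighbor_distance(n_repeats):
    """Estimate the C3-to-C3 distance (nm) for a monomer with n_repeats repeats.

    Two monomers span each edge, so every repeat added or removed changes
    the edge length by 2 * REPEAT_LENGTH_NM.
    """
    delta_repeats = n_repeats - REFERENCE_REPEATS
    return REFERENCE_DISTANCE_NM + 2 * REPEAT_LENGTH_NM * delta_repeats


for n in (3, 4, 5, 6, 7):
    print(f"DHR10-mica{n}-H: {nearest_neighbor_distance(n):.2f} nm")
```

This is a linear extrapolation from the five-repeat model only; the relaxed P6 models described above remain the authoritative geometry.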
Expressed protein sequences.
The designed protein sequences were expressed with an N-terminal addition containing a His6 tag and a thrombin protease cut site. The full amino-acid sequences of the proteins expressed are in Supplementary Table 2.
Plasmid design and synthesis.
Genes encoding shorter DHR10-micaX proteins (X < 10) were designed using the Codon-Scrambler web server version 1.0 (ref. 38) and purchased cloned into pET21b from Genscript. Genes encoding longer proteins were cloned from a DHR10-mica4 encoding plasmid with recursive directional ligation by plasmid reconstruction39. DNA encoding the 2 internal repeats was recursively doubled to encode the 4, 8, and 16 internal repeats of DHR10-mica6, DHR10-mica10, and DHR10-mica18, respectively. The construct encoding DHR10-mica14-checker was the longest version obtained during repeated attempts to clone an eighteen-repeat checkered version; it may be the product of homologous recombination in E. coli.
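The bookkeeping behind the recursive doubling can be sketched as follows. The assumption that the total repeat count X equals the number of internal repeats plus two terminal repeats is inferred from the numbers in the text (2 internal repeats in DHR10-mica4, 4 in DHR10-mica6) and is not stated explicitly; the function name is hypothetical.

```python
def doubled_constructs(start_internal=2, terminal_repeats=2, rounds=3):
    """Return (internal_repeats, total_repeats) after each doubling round.

    terminal_repeats=2 is an inferred assumption: X = internal + 2.
    """
    results = []
    internal = start_internal
    for _ in range(rounds):
        internal *= 2  # one round of recursive directional ligation doubles the insert
        results.append((internal, internal + terminal_repeats))
    return results


for internal, total in doubled_constructs():
    print(f"{internal} internal repeats -> DHR10-mica{total}")
```

Three doubling rounds starting from DHR10-mica4 give DHR10-mica6, DHR10-mica10, and DHR10-mica18, matching the constructs named above.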
Protein expression and purification.
Proteins were expressed in BLR(DE3) cells from Novagen using Studier's autoinduction medium (M2) in 0.5 L cultures in 2 L flasks at 37°C for 24 hours. Cells were pelleted at 4000 g for 20 minutes, resuspended in 30 mL lysis buffer (20 mM Tris HCl pH 8, 150 mM NaCl, 30 mM Imidazole, 10 mM Lysozyme, 1 mM DNAse, 1 mM PMSF) and lysed with a microfluidizer (Microfluidics M110P) at 18K PSI. Lysate was clarified at 17,000 g and the soluble fraction was batch bound to 1 mL Ni-NTA resin (Qiagen) for an hour. Lysate and resin were transferred to a gravity column and washed three times with 20 mL wash buffer (20 mM Tris HCl pH 8, 150 mM NaCl, 30 mM Imidazole) before eluting the target protein with 12 mL elution buffer (20 mM Tris HCl pH 8, 150 mM NaCl, 150 mM Imidazole). The eluate was dialyzed (3500 MWCO dialysis cassette, Thermo) into 4 liters of TBS (20 mM Tris HCl pH 8, 150 mM NaCl). Thrombin (Novagen) was added to cleave the His-tag and the sample was incubated at room temperature. After 4 hours, phenylmethylsulfonyl fluoride was added to a final concentration of 1 mM to inactivate the thrombin. The sample was again incubated with Ni-NTA resin to bind the cleaved His-tags and uncleaved His-tagged proteins. The flow-through fraction containing the cleaved proteins was concentrated in a 3000 MWCO centrifugal filter (Amicon Ultra-15).
Measuring protein concentration.
A Nanodrop 8000 spectrometer (Thermo Scientific) was used to measure the absorbance at 280 nm of 2 μL protein samples. The concentration was determined from the measured absorbance at 280 nm and the calculated extinction coefficient following the Beer-Lambert law.
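As a minimal sketch of the Beer-Lambert calculation, the function below converts an absorbance reading into a molar concentration using A = ε·c·l. The absorbance, extinction coefficient, and molecular weight in the example are hypothetical placeholders, and the 1 cm path length assumes the instrument reports absorbance normalized to a 1 cm path; none of these values comes from the paper.

```python
def concentration_molar(a280, extinction_coeff, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l).

    a280             absorbance at 280 nm (dimensionless)
    extinction_coeff molar extinction coefficient in M^-1 cm^-1
    path_cm          optical path length in cm (assumed 1 cm-normalized reading)
    """
    if extinction_coeff <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 / (extinction_coeff * path_cm)


# Hypothetical example values, for illustration only.
example_a280 = 0.5
example_epsilon = 50000.0   # M^-1 cm^-1
example_mw = 40000.0        # g/mol

c_molar = concentration_molar(example_a280, example_epsilon)
c_mg_per_ml = c_molar * example_mw  # mol/L * g/mol = g/L = mg/mL
print(f"{c_molar * 1e6:.1f} uM, {c_mg_per_ml:.2f} mg/mL")
```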
Size exclusion chromatography.
The concentrated, His-tag fused proteins were fractionated by size with an AKTA pure on a Superdex 200 Increase 10/300 GL column. Most samples were run twice with an intervening 24-36 hour storage period. TBS (150 mM NaCl and 20 mM Tris pH 8) was used as the running buffer. The elution profiles of designs with protein-protein interfaces (DHR10-micaX-NC and DHR10-micaX-H) were compared to DHR10-micaX versions to check their oligomeric state. DHR10-micaX-NC proteins eluted as a higher-order species of limited size, which was collected for further characterization (and assembly on mica). For DHR10-micaX-H proteins, a species that elutes like DHR10-micaX proteins was isolated and used in all subsequent experiments. SEC profiles of DHR10-micaX proteins of various X (number of repeats), with and without the Non-Capped (NC) and Honeycomb (H) protein-protein interfaces, are shown in Supplementary Fig. 4. In SEC experiments run in KCl buffer, the concentration of the protein in the peak was determined by measuring the A280 in the 0.5 mL fraction that contains the peak.
Multi-angle light scattering.
The molecular weights of DHR10-micaX and DHR10-micaX-NC proteins were determined by multi-angle light scattering as described in Fallas et al. (2017)28. All measurements were taken in TBS buffer (150 mM NaCl, 20 mM Tris, pH 8).
Circular dichroism.
Circular dichroism spectra were measured with an AVIV Model 420 circular dichroism spectrometer. Samples were in 20 mM NaPi pH 7 or 20 mM Tris pH 8 buffers in a 1 mm cuvette. Units were converted to mean residue ellipticity (MRE) by dividing the raw spectra by N * C * L * 10, where N = number of residues, C = protein concentration, and L = pathlength (0.1 cm).
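The conversion to mean residue ellipticity can be written out as a short function. The sketch assumes the raw signal is in millidegrees and the concentration is in mol/L, which with L in cm gives MRE in deg cm² dmol⁻¹; these units are a common convention and are an assumption here, since the text does not state them. The example numbers are hypothetical.

```python
def mean_residue_ellipticity(theta_mdeg, n_residues, conc_molar, path_cm=0.1):
    """MRE = theta / (N * C * L * 10).

    theta_mdeg  raw ellipticity in millidegrees (assumed unit)
    n_residues  number of residues, N
    conc_molar  protein concentration in mol/L, C (assumed unit)
    path_cm     cuvette pathlength in cm, L (0.1 cm for a 1 mm cuvette)
    """
    denominator = n_residues * conc_molar * path_cm * 10
    if denominator <= 0:
        raise ValueError("N, C, and L must all be positive")
    return theta_mdeg / denominator


# Hypothetical reading: -30 mdeg from a 200-residue protein at 5 uM.
mre = mean_residue_ellipticity(-30.0, 200, 5e-6)
print(f"MRE = {mre:.0f} deg cm^2 dmol^-1")
```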
Small angle X-ray scattering.
Small-angle X-ray scattering (SAXS) data from DHR10-mica18 were collected at the SIBYLS High Throughput SAXS Advanced Light Source in Berkeley, California21. Beam exposures of 0.3 seconds were collected over 10.2 seconds, resulting in 33 frames per sample. Data were collected at low (0.32 mg/mL, 3.4 μM) and high (1.8 mg/mL, 19.3 μM) protein concentrations in TBS buffer (150 mM NaCl and 20 mM Tris pH 8). To check for concentration-dependent effects, the averaged profiles from each concentration condition were used to calculate the radius of gyration (Rg) from a linear Guinier region with unbiased residuals using the ScÅtter Java application40. The averaged scattering profile from the high concentration was used in subsequent analyses. ScÅtter was used to determine the data resolution (qmax) as indicated by the linear region in the SIBYLS plot41. The FoXS webserver22 was used to compare the experimental scattering profile to a profile computed from the design model and to calculate chi2 and predicted Rg. The theoretical minimum Rg of unfolded DHR10-mica18 was calculated following Flory’s equation with values from chemically denatured proteins (Ro=1.9, ν=0.570)42.
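The Flory estimate for an unfolded chain can be reproduced with the power law Rg = Ro·N^ν and the parameters given above. The residue count of 906 is the value quoted for DHR10-mica18 in the discussion of solution behavior; treating Ro as being in Å is an assumption consistent with the Rg values reported in Å. The function name is hypothetical.

```python
def flory_rg(n_residues, r0=1.9, nu=0.570):
    """Flory scaling estimate of the radius of gyration: Rg = R0 * N**nu.

    r0 is assumed to be in angstroms, so the result is in angstroms.
    """
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return r0 * n_residues ** nu


rg_unfolded = flory_rg(906)
print(f"Unfolded-chain estimate: {rg_unfolded:.1f} A")

# Values reported in the text for comparison (angstroms).
reported = {"low concentration": 53.5, "high concentration": 49.0, "design model": 56.0}
for label, rg in reported.items():
    print(f"{label}: {rg:.1f} A ({rg / rg_unfolded:.2f} of the unfolded estimate)")
```

The estimate comes out just above 92 Å, well above the measured and model values, which is the comparison used to argue that the protein is folded in solution.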
Native mass spectrometry (MS).
Sample purity and integrity were analyzed by on-line buffer exchange MS using an UltiMate™ 3000 RSLC (Thermo Fisher Scientific) coupled to an Exactive Plus EMR Orbitrap instrument (Thermo Fisher Scientific) modified to incorporate a quadrupole mass filter and allow for surface-induced dissociation43. 40 pmol of protein (5 μL of 8 μM protein in TBS) was injected and on-line buffer exchanged to 200 mM ammonium acetate, pH 6.8 (AmAc) by a self-packed buffer exchange column44 (P6 polyacrylamide gel, BioRad) at a flow rate of 100 μL per min. Mass spectra were recorded for 1000 – 12000 m/z at 8750 resolution as defined at 200 m/z. The injection time was set to 200 ms. Voltages applied to the transfer optics were optimized to allow ion transmission while minimizing unintentional ion activation. Mass spectra were deconvoluted with UniDec version 2.6.545 using the following processing parameters: sample mass every 1 Da; peak FWHM 1 Thomson, Gaussian peak shape function. Organic source corrected average masses calculated with NIST Mass and Fragment Calculator v1.3246 from the His-tag-cleaved sequences were listed as the expected masses.
Atomic force microscopy.
The protein stock solution was diluted to the desired concentration with incubation buffer. The incubation buffer contained 20 mM Tris-HCl (pH 7) and 10 mM to 3 M KCl, depending on the experiment. 100 μL of diluted protein solution was dropped onto a freshly cleaved substrate, either mica (Ted Pella, CA), HOPG (Ted Pella, CA) w/o pre-60-second plasma treatment, molybdenum disulfide (Manchester Nanomaterials, UK), or fluorophlogopite (SPI Supplies, PA), and incubated for 0.5-2 hours in a sealed petri dish at room temperature. Before imaging, the mica surface was rinsed with fresh incubation buffer to remove unadsorbed protein molecules. Tris-HCl buffer (pH 7, 1 M) and KCl were bought from Sigma-Aldrich. Nuclease-free water was bought from Ambion.
The images were captured with a Cypher ES™ AFM (Asylum Research, CA) in aqueous solution using amplitude modulation mode. OTR4-B and ORC-8-C probes (Olympus, Japan) were used. The imaging force was adjusted to minimize any disruption of the self-assembly. All offline AFM data processing was done with SPIP™ software (Image Metrology, Denmark). The imaging buffer was 20 mM Tris-HCl (pH 7) with 10 mM to 3 M KCl, depending on the experiment.
The definition of the two-dimensional nematic-order parameter S.
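$$S = \langle \cos 2\theta_i \rangle = \frac{1}{N}\sum_{i=1}^{N} \cos 2\theta_i$$
where θi is the angle of the ith molecule with the nematic director47 and N is the number of molecules.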
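As an illustration of how S might be computed from molecular orientations measured in AFM images, the sketch below uses the standard two-dimensional form S = ⟨cos 2θi⟩. The second function estimates the director from the data when it is not known in advance, using the averaged double-angle components; this is a common approach and is an assumption here, not necessarily the procedure used by the authors. All function names are hypothetical.

```python
import math


def nematic_order_from_director(angles_rad, director_rad=0.0):
    """S = <cos 2(theta_i - director)> for a known director angle."""
    if len(angles_rad) == 0:
        raise ValueError("at least one angle is required")
    total = sum(math.cos(2.0 * (a - director_rad)) for a in angles_rad)
    return total / len(angles_rad)


def nematic_order_2d(angles_rad):
    """Estimate (S, director) when the director is unknown.

    Uses the mean double-angle components; S is the length of the mean
    (cos 2a, sin 2a) vector and the director is half its polar angle.
    """
    if len(angles_rad) == 0:
        raise ValueError("at least one angle is required")
    n = len(angles_rad)
    c = sum(math.cos(2.0 * a) for a in angles_rad) / n
    s = sum(math.sin(2.0 * a) for a in angles_rad) / n
    return math.hypot(c, s), 0.5 * math.atan2(s, c)


# Rods pointing along a and a + pi are equivalent (head-tail symmetry).
aligned = [0.3, 0.3 + math.pi, 0.3, 0.3]
crossed = [0.0, math.pi / 2]
print(nematic_order_2d(aligned))
print(nematic_order_2d(crossed))
```

S approaches 1 for a perfectly co-aligned population and 0 for an isotropic one; two equal populations at right angles also give S near 0.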
Secondary structure characterization.
Circular dichroism spectra (Supplementary Fig. 3) of DHR10-micaX, DHR10-micaX-NC, and DHR10-micaX-H proteins show a characteristic alpha-helical signal with minima at 208 nm and 222 nm and positive values below 202 nm.
Tertiary structure characterization.
Small-angle X-ray scattering profiles of DHR10-mica18 were measured at SIBYLS21 and averaged within each concentration condition. ScÅtter40 was used to determine radius-of-gyration values, Rg = 53.5 Å and Rg = 49 Å, from the low and high concentrations, respectively. The FoXS22 computed Rg of the design model was 56 Å. The averaged high-concentration profiles have data resolution limited by qmax = 0.35 (ref. 41). The qmax-trimmed dataset matched a profile FoXS computed from the design model with chi2 = 2.88 (Supplementary Fig. 6).
Quaternary structure characterization.
DHR10-mica4, DHR10-mica10, and DHR10-mica18 have solution-state molecular weights as determined by MALS that are 0.97, 0.93, and 0.84 times their design value (Supplementary Fig. 4). DHR10-micaX proteins were used as monomeric references when evaluating the size-fractionation profiles of proteins with designed protein-protein interfaces (DHR10-micaX-NC and DHR10-micaX-H), an approach used by authors designing solution-state oligomers from other DHR proteins28. DHR10-mica18 remains soluble in tris buffer containing 3M KCl, and its SEC profile with 3M KCl (Supplementary Fig. 5a) resembles its profile with 150mM NaCl (Supplementary Fig. 4).
When fractionated by size during SEC purification, DHR10-micaX-NC proteins elute as a larger species than DHR10-micaX proteins of the same X (Supplementary Fig. 4). Molecular weights of DHR10-mica2-NC, DHR10-mica6-NC, and DHR10-mica18-NC determined by MALS were 1.9, 2.1, and 1.6 times the MW of a monomer (Supplementary Fig. 4). These species were predominantly monomeric as assayed by native mass spectrometry, with some dimers of DHR10-mica6-NC also detected (Extended Data File 2). DHR10-mica6-NC had a similar SEC profile when in solution with 100 mM KCl (Supplementary Fig. 5b) and 150 mM NaCl (Supplementary Fig. 4). When exchanged into buffer containing 3 M KCl (and 20 mM Tris pH 8), samples of 16 μM DHR10-mica6-NC and 5 μM DHR10-mica18-NC precipitated out of solution, forming a film on the side of the centrifugal filter.
After lysis, DHR10-micaX-H proteins tended to form soluble aggregates, but we could isolate species with size-fractionation profiles like those of DHR10-micaX proteins of corresponding X (Supplementary Fig. 4); these species showed no additional signs of aggregation during purification and were used in all subsequent experiments.
Discussion of solution behavior.
DHR10-micaX, DHR10-micaX-NC and DHR10-micaX-H proteins have alpha-helical circular dichroism spectra that resemble the spectra of DHR10, the designed helical repeat (DHR) protein they are based on20. They were purified with SEC in TBS buffer (150mM NaCl, 20mM Tris pH 8) and other standard purification protocols for soluble proteins.
We analyzed the solution-state SAXS scattering profile of DHR10-mica18 in two ways. First, we determined that its radius of gyration, 53.5 Å or 49 Å (at ~3 μM and ~20 μM, respectively), was slightly smaller than that of the design model (Rg = 56 Å) and substantially smaller than the value predicted by Flory’s equation42 for an unfolded 906-residue protein (Rg > 92 Å). Second, we found the measured SAXS profile (Supplementary Fig. 6) matched a profile computed from the DHR10-mica18 design model pdb (Fig. 1a) with chi2 = 2.88.
DHR10-micaX proteins are monomeric during purification in TBS as assayed by SEC and MALS (Supplementary Fig. 4). Their SEC elution volume shifts predictably based on their number of repeats, making them useful standards when looking at profiles of designs with protein-protein interfaces. Comparison of SEC profiles in Supplementary Figs. 4 and 5 indicates that DHR10-mica18 remains monomeric in solution with 3M KCl at greater protein concentration (3μM) than we used to incubate monolayers on mica (0.1μM) (Fig. 2c).
In DHR10-micaX-NC (non-capped) designs, the charged residues that ‘cap’ the ends of the repeat protein and prevent end-to-end association have been removed. Theoretically the interactions could extend indefinitely, but the proteins elute from SEC as a higher-order species of limited size (Supplementary Fig. 4). Their peaks skew toward greater elution volumes (smaller species). MALS indicates they are dimers (Supplementary Fig. 4), but they are monomeric upon dilution as assayed by native mass spectrometry (Extended Data File 2). Together, these data suggest DHR10-micaX-NC proteins are purified as concentration-dependent oligomers in TBS. DHR10-mica6-NC has a similar size-fractionation profile during SEC with 100 mM KCl (Supplementary Fig. 5b) at 19 μM protein, a greater concentration than that used to incubate the adsorbed layers shown in Fig. 3e (0.3 μM) in 100 mM KCl buffer. DHR10-mica6-NC and DHR10-mica18-NC samples (at 16 μM and 5 μM protein, respectively) precipitated out of solution containing 3 M KCl.
Although the behavior of DHR10-micaX-H (honeycomb) proteins in solution was also affected by their protein-protein interfaces, we were able to isolate species with SEC profiles that look monomeric (Supplementary Fig. 4). This is consistent with our observations of DHR10-mica5-H growing hexagonal domains on the substrate (Supplementary Fig. 24) and adsorbing without forming the array at KCl concentrations with lower mobility on the surface (Supplementary Figs. 11, 23).
After observing the rod-like DHR10-mica18 proteins assemble into arrays on muscovite mica in solutions with 3 M KCl (Fig. 2c), we wanted to try to modulate the structure of adsorbed monolayers with designed protein-protein interactions. Although designs with these interfaces (DHR10-micaX-NC and DHR10-micaX-H proteins) behaved differently in solution, we were able to isolate oligomeric or monomeric species with standard techniques for soluble proteins. When exchanged into solutions containing 3 M KCl they precipitate out of solution, but the decreasing solubility or ‘salting out’ of proteins with increasing electrolyte concentration is well known. We still wanted to incubate the monolayers on mica in 3 M KCl because we had found earlier that the mobility and order of our proteins when adsorbed on mica increase with [KCl] (Fig. 2a-c and Supplementary Fig. 11). To do this we purified the proteins in TBS, dialyzed them into 20 mM Tris buffer with no salt, and finally diluted them into a solution containing 3 M KCl and <1 μM protein immediately before incubating on mica. With this procedure, we again saw that greater [KCl] increased alignment to the mica lattice for both the micaX-NC (Fig. 3d-i and Supplementary Fig. 18) and micaX-H proteins (Fig. 4k and Supplementary Fig. 23).
Further discussion of DHR10-micaX assembly with [K+].
The dramatic increase in mobility with increasing KCl concentrations may arise from competition for the protein carboxylates between solution K+ ions and K+ ions on the mica surface, as is seen with other cations48, along with development of a strongly-bound hydration layer49. That the mica surface is devoid of DHR10-mica6 proteins in 3M KCl (Fig. 2e) is unlikely, given the high coverage at 10mM (Fig. 2d), the increasing coverage seen for DHR10-mica14 (Fig. 2f and Supplementary Fig. 15e, f) and DHR10-mica18 at 3M KCl (Fig. 2a-c and Supplementary Fig. 9), and the fact that the mica lattice would be visible if it were protein free. More likely, the proteins transitioned to a 2D liquid phase, as expected for rod-shaped particles with low aspect ratios24, and move too rapidly to be observed as stationary objects by conventional AFM.
Further discussion of DHR10-micaX-H honeycomb assembly.
At 100mM KCl, DHR10-mica5-H adsorbed to mica but did not assemble into the lattice (Supplementary Fig. 23a) and even at 1.5M, the assembled network had many errors (Supplementary Fig. 23b). In-situ AFM of DHR10-mica5-H assembly in 3M KCl (Supplementary Fig. 24) showed the hexagonal domains growing on mica are orientationally aligned, even when not in contact, suggesting their co-alignment is caused by a shared lattice-match to the substrate.
Unlike DHR10-micaX-H with X = 4, 5, 6, or 7 repeat units, which form orientationally aligned domains on mica, DHR10-mica3-H domains align both to the K+ sublattice and at 30° to the sublattice (Supplementary Fig. 25); for X = 3, the difference in the lattice match between the 9 carboxylates and the underlying K+ sites along the 30° orientation may be too small to inhibit hexagon formation.
Data availability.
Design models in pdb format are available on GitHub (https://github.com/pylesharley/DHR10micaX/). Source data for Supplementary Figs. 2-6 are provided with the paper. All other data not included in the manuscript are available upon reasonable request to the authors.
Acknowledgments
We thank T. Brunette, P. Huang, F. Parmeggiani, Y. Hsia, W. Sheffler, T. Craven, S. Boyken, and Z. Chen for helpful suggestions, D. Alonso, L. Goldschmidt, and P. Vecchiato for supporting computational resources, B. Legg and J. Tao for AFM support, B. Legg for providing the model of mica (001), L. Carter for SEC-MALS support, and F. Busch and V. Wysocki for native mass spectrometry support. Development of imaging protocols was supported by the Laboratory Directed Research and Development Office through the Materials Synthesis and Simulations Across Scales Initiative. AFM experiments on the DHR10-micaX and DHR10-micaX-NC were supported by US Department of Energy (DOE), Office of Basic Energy Sciences (BES), Biomolecular Materials Program (BMP) at Pacific Northwest National Laboratory (PNNL). AFM experiments on DHR10-micaX-H formation of hexagonal lattices and analysis of DHR10-micaX binding kinetics were performed at PNNL and supported by the US DOE BES Energy Frontier Research Center CSSAS (The Center for the Science of Synthesis Across Scales) located at University of Washington; Award Number DE-SC0019288. Design and Synthesis of DHR10-micaX-H and its variants were performed at UW and supported by the US DOE BES BMP (DE-SC0018940). H.P. was supported by Institute for Protein Design Materials Science Research Gift Fund and Michelson Medical Research Foundation, Protein Design Initiative Fund, and DOE Biomolecular Materials Program (DE-SC0018940). D.B. is funded by the Howard Hughes Medical Institute and Bruce and Jeannie Nordstrom / Patty and Jimmy Barrier Gift for the Institute for Protein Design Directors Fund. We thank the staff at the Advanced Light Source SIBYLS beamline at Lawrence Berkeley National Laboratory, including K. Burnett, G. Hura, and J. Tainer for the services provided through the mail-in SAXS program, which is supported by the DOE Office of Biological and Environmental Research Integrated Diffraction Analysis program DOE BER IDAT grant (DE-AC02-05CH11231) and NIGMS supported ALS-ENABLE (GM124169-01). PNNL is a multi-program national laboratory operated for the Department of Energy by Battelle under Contract No. DE-AC05-76RL01830.
Footnotes
Supplementary Information is available in the online version of the paper.
The authors declare no competing interests.
References
- 1. Sodek J, Ganss B & McKee MD Osteopontin. Crit. Rev. Oral Biol. Med 11, 279–303 (2000).
- 2. Addadi L, Joester D, Nudelman F & Weiner S Mollusk shell formation: a source of new concepts for understanding biomineralization processes. Chem. Eur. J 12, 980–987 (2006).
- 3. Shaw WJ Solid-state NMR studies of proteins immobilized on inorganic surfaces. Solid State Nucl. Mag. Res 70, 1–14 (2015).
- 4. Staniland SS & Rawlings AE Crystallizing the function of the magnetosome membrane mineralization protein Mms6. Biochem. Soc. Trans 44, 883–890 (2016).
- 5. Fukushima T et al. The molecular basis for binding of an electron transfer protein to a metal oxide surface. J. Am. Chem. Soc 139, 12647–12654 (2017).
- 6. Davies PL Ice-binding proteins: a remarkable diversity of structures for stopping and starting ice growth. Trends Biochem. Sci 39, 548–555 (2014).
- 7. DeOliveira DB & Laursen RA Control of calcite crystal morphology by a peptide designed to bind to a specific surface. J. Am. Chem. Soc 119, 10627–10631 (1997).
- 8. Masica DL, Schrier SB, Specht EA & Gray JJ De novo design of peptide-calcite biomineralization systems. J. Am. Chem. Soc 132, 12252–12262 (2010).
- 9. Song R & Cölfen H Additive controlled crystallization. Cryst. Eng. Comm 13, 1249–1276 (2011).
- 10. Grigoryan G et al. Computational design of virus-like protein assemblies on carbon nanotube surfaces. Science 332, 1071–1076 (2011).
- 11. Brown CL, Aksay IA, Saville DA & Hecht MH Template-Directed Assembly of a de Novo Designed Protein. J. Am. Chem. Soc 124, 6846–6848 (2002).
- 12. Mustata G-M et al. Graphene Symmetry Amplified by Designed Peptide Self-Assembly. Biophys. J 110, 2507–2516 (2016).
- 13. Leow WW & Hwang W Epitaxially Guided Assembly of Collagen Layers on Mica Surfaces. Langmuir 27, 10907–10913 (2011).
- 14. Tao J et al. Energetic basis for the molecular-scale organization of bone. Proceedings of the National Academy of Sciences 112, 326–331 (2014).
- 15. Akutagawa T et al. Formation of oriented molecular nanowires on mica surface. Proceedings of the National Academy of Sciences 99, 5028–5033 (2002).
- 16. Loo RW & Goh MC Potassium ion mediated collagen microfibril assembly on mica. Langmuir 24, 13276–13278 (2008).
- 17. Shin S-H et al. Direct observation of kinetic traps associated with structural transformations leading to multiple pathways of S-layer assembly. Proc. Natl. Acad. Sci. U. S. A 109, 12968–12973 (2012).
- 18. Aghebat Rafat A, Pirzer T, Scheible MB, Kostina A & Simmel FC Surface-assisted large-scale ordering of DNA origami tiles. Angew. Chem. Int. Ed Engl 53, 7665–7668 (2014).
- 19. Ma X et al. Tuning crystallization pathways through sequence engineering of biomimetic polymers. Nat. Mater 16, 767–774 (2017).
- 20. Brunette TJ et al. Exploring the repeat protein universe through computational protein design. Nature 528, 580–584 (2015).
- 21. Dyer KN et al. High-Throughput SAXS for the Characterization of Biomolecules in Solution: A Practical Approach. Methods Mol Biol 1091, 245–258 (2014).
- 22. Schneidman-Duhovny D, Hammel M, Tainer JA & Sali A Accurate SAXS profile computation and its assessment by contrast variation experiments. Biophysical Journal 105, 962–974 (2013).
- 23. Kuwahara Y Comparison of the surface structure of the tetrahedral sheets of muscovite and phlogopite by AFM. Phys. Chem. Minerals 28, 1–8 (2001).
- 24. Boles MA, Engel M & Talapin DV Self-Assembly of Colloidal Nanocrystals: From Intricate Structures to Functional Materials. Chem. Rev 116, 11220–11289 (2016).
- 25. Ruotolo BT & Robinson CV Aspects of native proteins are retained in vacuum. Curr. Opin. Chem. Biol 10, 402–408 (2006).
- 26. Sahasrabuddhe A et al. Confirmation of intersubunit connectivity and topology of designed protein complexes by native MS. Proc. Natl Acad. Sci 115, 1268–1273 (2018).
- 27. Whitelam S et al. Common Physical Framework Explains Phase Behavior and Dynamics of Atomic, Molecular, and Polymeric Network Formers. Physical Review X 4 (2014).
- 28. Fallas JA et al. Computational design of self-assembling cyclic protein homo-oligomers. Nature Chemistry 9, 353–360 (2017).
- 29. Boyken SE et al. De novo design of protein homo-oligomers with modular hydrogen-bond network-mediated specificity. Science 352, 680–687 (2016).
- 30. Wang J et al. Differential modulating effect of MoS2 on amyloid peptide assemblies. Chem. Eur. J 24, 3397–3402 (2018).
- 31. Chan P, Curtis RA & Warwicker J Soluble expression of proteins correlates with a lack of positively-charged surface. Scientific Reports 3, 3333 (2013).
- 32. Fleishman SJ et al. RosettaScripts: A Scripting Language Interface to the Rosetta Macromolecular Modeling Suite. PLoS One 6, e20161 (2011).
- 33. Huang PS et al. RosettaRemodel: A Generalized Framework for Flexible Backbone Protein Design. PLoS One 6, e24109 (2011).
- 34. Chaudhury S, Lyskov S & Gray JJ PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta. Bioinformatics 26, 689–691 (2010).
- 35. Kleffner R et al. Foldit Standalone: a video game-derived protein structure manipulation interface using Rosetta. Bioinformatics 33, 2765–2767 (2017).
- 36. The PyMOL Molecular Graphics System, Version 2.1.1 Schrödinger, LLC.
- 37. MacKerell AD et al. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 102, 3586–3616 (1998).
- 38. Tang NC & Chilkoti A Combinatorial codon scrambling enables scalable gene synthesis and amplification of repetitive proteins. Nat. Mater 15, 419–424 (2016).
- 39. McDaniel JR, Mackay JA, Quiroz FG & Chilkoti A Recursive directional ligation by plasmid reconstruction allows rapid and seamless cloning of oligomeric genes. Biomacromolecules 11, 944–952 (2010).
- 40. Rambo RP ScÅtter: Software for SAXS Analysis. Version 3.0. Developed at Diamond Light Source (Didcot, UK) and SIBYLS beamline (12.3.1) of the Advanced Light Source, Berkeley, CA: http://www.bioisis.net/tutorial/9 (2016).
- 41. Rambo RP & Tainer JA Characterizing Flexible and Intrinsically Unstructured Biological Macromolecules by SAS Using the Porod-Debye Law. Biopolymers 95, 559–571 (2011).
- 42. Bernadó P & Svergun DI Structural analysis of intrinsically disordered proteins by small-angle X-ray scattering. Mol. BioSyst 8, 151–167 (2012).
- 43. VanAernum ZL, Gilbert JD, Belov ME, Makarov AA, Horning SR & Wysocki VH Surface-Induced Dissociation of Noncovalent Protein Complexes in an Extended Mass Range Orbitrap Mass Spectrometer. Anal. Chem 91, 3611–3618 (2019).
- 44. Waitt GM, Xu R, Wisely GB & Williams JD Automated in-line gel filtration for native state mass spectrometry. J. Am. Soc. Mass Spectrom 19, 239–245 (2008).
- 45. Marty MT, Baldwin AJ, Marklund EG, Hochberg GK, Benesch JL & Robinson CV Bayesian deconvolution of mass and ion mobility spectra: from binary interactions to polydisperse ensembles. Anal. Chem 87, 4370–4376 (2015).
- 46. Kilpatrick EL, Liao WL, Camara JE, Turko IV & Bunk DM Expression and characterization of 15N-labeled human C-reactive protein in Escherichia coli and Pichia pastoris for use in isotope-dilution mass spectrometry. Protein Expr. Purif 85, 94–99 (2012).
- 47. Frenkel D & Eppenga R Evidence for algebraic orientational order in a two-dimensional hard-core nematic. Phys. Rev. A 31, 1776–1787 (1985).
- 48. Newcomb CJ, Qafoku NP, Grate JW, Bailey VL & De Yoreo JJ Developing a molecular picture of soil organic matter–mineral interactions by quantifying organo–mineral binding. Nat. Commun 8, 396 (2017).
- 49. Martin-Jimenez D, Chacon E, Tarazona P & Garcia R Atomically resolved three-dimensional structures of electrolyte aqueous solutions near a solid surface. Nat. Commun 7, 12164 (2016).