Proceedings of the National Academy of Sciences of the United States of America. 2016 Dec 8;113(52):15012–15017. doi: 10.1073/pnas.1600188113

Computational design of a homotrimeric metalloprotein with a trisbipyridyl core

Jeremy H Mills a,b,c,1, William Sheffler a,1, Maraia E Ener a,d, Patrick J Almhjell b,c, Gustav Oberdorfer a, José Henrique Pereira e, Fabio Parmeggiani a, Banumathi Sankaran f, Peter H Zwart f, David Baker a,g,2
PMCID: PMC5206526  PMID: 27940918

Significance

This article reports the computational design of a threefold symmetric, self-assembling protein homotrimer containing a highly stable noncanonical amino acid-mediated metal complex within the protein interface. To achieve this result, recently developed protein–protein interface design methods were extended to include a metal-chelating noncanonical amino acid containing a bipyridine functional group in the design process. Bipyridine metal complexes can give rise to photochemical properties that would be impossible to achieve with naturally occurring amino acids alone, suggesting that the methods reported here could be used to generate novel photoactive proteins.

Keywords: computational protein design, noncanonical amino acids, metalloproteins, protein self-assembly

Abstract

Metal-chelating heteroaryl small molecules have found widespread use as building blocks for coordination-driven, self-assembling nanostructures. The metal-chelating noncanonical amino acid (2,2′-bipyridin-5yl)alanine (Bpy-ala) could, in principle, be used to nucleate specific metalloprotein assemblies if introduced into proteins such that one assembly had much lower free energy than all alternatives. Here we describe the use of the Rosetta computational methodology to design a self-assembling homotrimeric protein with [Fe(Bpy-ala)3]2+ complexes at the interface between monomers. X-ray crystallographic analysis of the homotrimer showed that the design process had near-atomic-level accuracy: The all-atom rmsd between the design model and crystal structure for the residues at the protein interface is ∼1.4 Å. These results demonstrate that computational protein design together with genetically encoded noncanonical amino acids can be used to drive formation of precisely specified metal-mediated protein assemblies that could find use in a wide range of photophysical applications.


The small-molecule metal ligand 2,2′-bipyridine (Bpy) has found widespread use in inorganic chemistry because of its redox stability, ability to form high-affinity complexes of defined geometry with a variety of transition metals, and useful photochemical properties (1). The highly specific geometries adopted by Bpy in metal complexes have been used to generate coordination-driven self-assembling nanostructures with defined structures (2). For example, chemically synthesized, unstructured peptides containing the Bpy functional group as the side chain of an amino acid formed three-helix bundles upon addition of metals (3, 4). Recently, the ability to use Bpy in biological contexts was expanded through the addition of the noncanonical amino acid (NCAA) (2,2′-bipyridin-5yl)alanine (Bpy-ala) to the genetic code of Escherichia coli (5). Because Bpy forms very stable octahedral complexes with a number of biologically relevant divalent cations (e.g., Fe2+, Zn2+, Co2+, and Ni2+), an appropriately placed Bpy-ala residue could potentially be used to generate self-assembling proteins nucleated by [M(Bpy)3]2+ complexes (where M is a divalent cation that forms an octahedral complex with Bpy). Although we (6) and others (7–9) have engineered proteins in which Bpy-ala served in structural or functional capacities, to our knowledge, the possibility of using this NCAA to drive or stabilize the formation of a protein complex has not been explored.

The strategy of using metal ions to mediate protein-complex formation has its origins in naturally occurring proteins, such as hexameric proinsulin, in which complex formation is regulated by bound Ca2+ and Zn2+ ions (10). Recently, the Tezcan (11) and Kuhlman (12) groups used rational and computational protein design methods, respectively, to engineer metal-dependent protein–protein interactions using conserved dihistidine metal-binding motifs from known metalloproteins. In each case, structural analysis of the engineered protein complexes suggested that, although metal-dependent assembly had been achieved, the metal-binding sites had not formed as desired (11, 12). Although notable successes, these studies highlight the difficulty in precisely stabilizing the conformations of multiple amino acid side chains at once. We reasoned that an appropriately placed, genetically encoded Bpy-ala residue could nucleate a threefold symmetric, homotrimeric metalloprotein while overcoming some of the difficulties faced in the previous studies.

Here, we report the use of the Rosetta computational protein design methodology (13) to engineer a homotrimeric protein containing an octahedral [Fe(Bpy-ala)3]2+ complex. Crystallographic analysis indicated that near-atomic-level accuracy between the design model and resulting protein was achieved in the vicinity of the protein–protein interface. The methods developed in this study could be applied to the generation of novel protein therapeutics, protein-based materials, and metalloproteins with useful optical or photochemical properties.

Results

Computational Design Strategy.

A previous study in which Bpy-ala was used to engineer a novel metal-binding site in a protein scaffold highlighted important considerations for protein design efforts using this NCAA (6). In a design in which the Bpy-ala residue was not well constrained by packing interactions, we observed the formation of a highly stable [Fe(Bpy-alaf)2(Bpy-alap)]2+ complex (where Bpy-alaf is the free amino acid in the cell and Bpy-alap is the NCAA that has been incorporated into the protein) within the E. coli expression host. This observation was encouraging from the perspective that proteins containing [Fe(Bpy-ala)3]2+ complexes can be purified intact from E. coli, but also suggested that for the present application, monomeric or dimeric forms of the desired trimeric protein could be kinetically trapped by free Bpy-ala within the cell. Furthermore, even if a [Fe(Bpy-alap)3]2+ complex is formed as desired, in the absence of a driving force other than the Bpy-ala complex formation, the subunit orientations in the resulting protein complexes would not likely be well defined.

To address these concerns, our computational design approach sought to achieve two distinct goals: identification of sites of Bpy-ala incorporation in starting scaffolds that are compatible with trimer formation and design of interactions between protein subunits to drive the formation of a particular trimeric orientation to the exclusion of others. Our starting scaffold set consisted of nine de novo designed repeat proteins of known structure recently reported by our laboratory (14, 15). Because they are constructed from many identical subunits, the sizes of repeat proteins can be easily scaled in a defined manner, and the versatility in length afforded by repeat protein architectures could be useful for downstream applications.

Potential sites of Bpy-ala incorporation within a given starting scaffold were first identified by using Rosetta docking calculations (see Dataset S1 for full computational design methods). For each parent scaffold, three rotational and one translational degrees of freedom were sampled with a 2.5° covering radius to generate all possible threefold symmetric conformations in the context of polyalanine versions of the parent scaffolds. Successfully docked proteins contained geometrically complementary interfaces without steric clashes between the protein backbones of adjacent subunits.
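As a rough illustration of the symmetric docking idea described above, the following Python sketch expands one subunit into three copies related by 120° rotations about the z axis and slides the subunit outward until no atoms from different copies clash. It is not the purpose-built Rosetta docking code: it samples only the single translational degree of freedom (the three rotational ones are omitted for brevity), and the function names, the 3.0 Å clash cutoff, and the toy coordinates are all assumptions made for illustration.

```python
import math

def rotate_z(point, angle_deg):
    """Rotate a 3D point about the z axis (taken here as the threefold axis)."""
    a = math.radians(angle_deg)
    x, y, z = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)

def c3_copies(coords):
    """Return the three symmetry-related copies (0°, 120°, 240°)."""
    return [[rotate_z(p, k * 120.0) for p in coords] for k in range(3)]

def min_inter_subunit_distance(copies):
    """Smallest distance between atoms that belong to different copies."""
    best = float("inf")
    for i in range(len(copies)):
        for j in range(i + 1, len(copies)):
            for p in copies[i]:
                for q in copies[j]:
                    best = min(best, math.dist(p, q))
    return best

def closest_clash_free_radius(coords, radii, cutoff=3.0):
    """Smallest radial offset (Å) at which the C3 trimer has no clash.

    `cutoff` is a hypothetical clash threshold, not a value from the paper.
    """
    for r in sorted(radii):
        shifted = [(x + r, y, z) for x, y, z in coords]
        if min_inter_subunit_distance(c3_copies(shifted)) >= cutoff:
            return r
    return None

# Toy "backbone": 20 points on a straight line along z (not a real scaffold).
subunit = [(0.0, 0.0, 1.5 * k) for k in range(20)]
radii = [0.5 * k for k in range(1, 21)]
best_radius = closest_clash_free_radius(subunit, radii)
```

In a fuller treatment the same clash test would be repeated over a grid of subunit orientations, which is where the 2.5° covering radius mentioned in the text would enter.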

We next searched for potential sites of octahedral [M(Bpy-ala)3]2+ complex incorporation within these trimeric scaffolds. Because previous results indicated that [Fe(Bpy-ala)3]2+ complexes can form within E. coli expression hosts (6), we chose to design our trimeric proteins in the context of these complexes. Octahedral Bpy complexes exist in two enantiomeric forms, Λ and Δ, and geometric constraints for each isomer were extracted from the Cambridge Structural Database (CSD ID KICRAR) and used to generate a [Fe(Bpy-ala)3]2+ complex with ideal geometries for further calculations (Fig. 1A). All symmetry-related Cβ atoms in a given docked configuration were then computationally scanned for the ability to overlap with the Cβs in Λ– and Δ–Bpy-ala complexes. Placed Bpy-ala complexes were removed from further consideration if the Cα–Cβ–Cγ angle differed by >3° from the value of 116° observed in the crystal structure of a Bpy-ala–containing protein [Protein Data Bank (PDB) ID code 4iww].
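The Cα–Cβ–Cγ angle filter described above can be sketched in a few lines of Python. The 116° reference value and the 3° tolerance come from the text; the function names and the example coordinates are hypothetical and are not taken from the Rosetta protocol.

```python
import math

IDEAL_ANGLE_DEG = 116.0  # Cα–Cβ–Cγ angle observed in PDB entry 4iww (from the text)
TOLERANCE_DEG = 3.0      # placements deviating by more than this are discarded

def bond_angle_deg(a, b, c):
    """Angle in degrees at atom b formed by atoms a, b, and c."""
    v1 = [ai - bi for ai, bi in zip(a, b)]
    v2 = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def passes_angle_filter(ca, cb, cg):
    """True if the placed side chain keeps the angle within tolerance."""
    return abs(bond_angle_deg(ca, cb, cg) - IDEAL_ANGLE_DEG) <= TOLERANCE_DEG

def atom_at_angle(angle_deg, length=1.5):
    """Hypothetical Cγ position at a chosen angle from the x axis, in the xy plane."""
    t = math.radians(angle_deg)
    return (length * math.cos(t), length * math.sin(t), 0.0)

# Made-up geometry (Å): Cβ at the origin and Cα on the x axis.
ca, cb = (1.53, 0.0, 0.0), (0.0, 0.0, 0.0)
accepted = passes_angle_filter(ca, cb, atom_at_angle(117.5))
rejected = passes_angle_filter(ca, cb, atom_at_angle(125.0))
```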

Fig. 1.

An overview of the computational design methods. (A) Octahedral [Fe(Bpy-ala)3]2+ complexes (yellow sticks, Λ and Δ isomers) were generated from small-molecule crystal structures and used as inputs for a docking algorithm that identified sites of incorporation in repeat protein scaffolds (gray cartoons) compatible with threefold symmetric protein complexes. (B) Successfully docked trimers (multicolored cartoons) were compatible with the geometries set by [Fe(Bpy-ala)3]2+ complexes and contained no steric clashes between protein subunits. (C) Computational interface design methods were used to engineer highly complementary surfaces between trimer subunits (light blue sticks) to drive the formation of a protein complex with a desired threefold symmetric orientation.

Docked trimer models containing a [Fe(Bpy-ala)3]2+ complex (Fig. 1B) were then subjected to iterative rounds of RosettaDesign (16) to generate a low-energy, well-packed interface between protein subunits (Fig. 1C). These designs were then filtered on the shape complementarity (17) between protein interfaces and the change in solvent-exposed surface area (ΔSASA) upon dissociation of the protein complex.

Initial Characterization of the Designed Proteins.

A total of seven designed proteins were generated through oligonucleotide mutagenesis of the parent scaffolds. An amber stop codon (TAG) was substituted for the wild-type codon at the desired site of Bpy-ala incorporation. pET21b expression plasmids encoding the designed proteins were cotransformed with the pEVOL-BpyRS plasmid (Schultz laboratory, The Scripps Research Institute) into a BL21(DE3) E. coli expression strain. The pEVOL-BpyRS plasmid contains an orthogonal tRNA/aminoacyl tRNA synthetase pair that enzymatically acylates a tRNA containing an anticodon loop specific to the amber codon with the Bpy-ala amino acid. Thus, full-length protein expression should only occur in the presence of the NCAA; in the absence of Bpy-ala in the expression medium, the amber stop codon should terminate translation. Expression trials for each of the designed proteins were therefore carried out in the presence and absence of Bpy-ala. Of the seven designs tested, six showed full-length protein expression only in the presence of the Bpy-ala amino acid (Fig. S1) and were subjected to additional characterization.
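The amber codon substitution described above amounts to replacing one in-frame codon with TAG. A minimal Python sketch follows; the helper name and the short open reading frame are invented for illustration and do not correspond to any of the designed proteins.

```python
def introduce_amber_codon(coding_sequence, residue_number):
    """Replace the codon for a 1-based residue number with the amber codon TAG."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = 3 * (residue_number - 1)
    if not 0 <= start < len(seq):
        raise ValueError("residue number lies outside the coding sequence")
    return seq[:start] + "TAG" + seq[start + 3:]

# Made-up six-codon reading frame: ATG AGC CTG GAA AAA TAA.
orf = "ATGAGCCTGGAAAAATAA"
mutant = introduce_amber_codon(orf, 3)  # swaps the third codon (CTG) for TAG
```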

Fig. S1.

SDS/PAGE analysis of the expression of the original trimer designs. The expression of the designed proteins (TRI_01–TRI_07) and the parent scaffolds (Ank_1–Ank_3) is shown. Above each lane for the designed proteins is a minus sign (indicating that Bpy-ala was not added during expression), a plus sign (indicating that Bpy-ala was added to the expression medium), or a Y (indicating suppression of the amber codon in the designed proteins with a suppressor tRNA charged with tyrosine, which served as an expression control).

Tris-Bpy complexes containing metals including Zn2+, Ni2+, or Fe2+ have characteristic spectroscopic signatures in the UV and visible wavelength ranges. Absorbance in the range of 290–330 nm corresponds to π–π* transitions within the Bpy ligand and is observed in complexes of Bpy with a number of biologically relevant metals (18). [Fe(Bpy)3]2+ complexes exhibit metal–ligand charge transfer (MLCT), which gives rise to characteristic absorption spectra in the visible wavelength range of 450–575 nm (19). To examine whether Bpy-ala–mediated complex formation had occurred, the absorbance of the designs was analyzed from 230 to 600 nm after nickel affinity purification. Two designed proteins (TRI_03 and TRI_05) showed absorbance in the range of 450–575 nm with λmax values of 490 and 525 nm, consistent with a [Fe(Bpy)3]2+ complex (Fig. 2 A and C) (19). Both TRI_03 and TRI_05 had as their parent scaffolds de novo-designed ankyrin-like repeat proteins (PDB ID codes 4gpm and 4hb5, respectively) (14). Purified designs were then subjected to analysis by size-exclusion chromatography (SEC) and were observed to elute at volumes indicative of proteins slightly larger than expected for the trimeric complexes relative to standards of known molecular weight (Fig. 2 B and D). This discrepancy in apparent size could be due to the unusually large Stokes radii of the trimeric complexes (20). Although both proteins formed some soluble aggregates (Fig. 2 B and D), TRI_03 appeared to be more prone to aggregation than TRI_05. Chiral Λ and Δ Bpy complexes give rise to characteristic circular dichroism (CD) signals in the near-UV. CD analysis of TRI_05 gave a spectrum characteristic of the Λ isomer (Fig. 2E) (19), which was consistent with the designed complex.

Fig. 2.

Experimental characterization of designs TRI_03 and TRI_05. (A, C, and E) Spectroscopic characterization of TRI_03 (A) and TRI_05 (C) and [Fe(Bpy-ala)3]2+ (E) in the range of 230–650 nm is shown. The MLCT regions of these spectra are enlarged in Insets. (B and D) SEC chromatograms of TRI_03 (B) and TRI_05 (D) are shown (solid lines) overlaid with traces of the monomeric parent scaffolds of TRI_03 and TRI_05 (PDB ID codes 4gpm and 4hb5, respectively; dashed lines). Soluble aggregates are observed for each protein at the column void volume of ∼8 mL. (F) Near-UV CD analysis of TRI_05 gave a spectrum consistent with the Λ isomer of a [Fe(Bpy)3]2+ complex.

To further characterize TRI_03 and TRI_05, the proteins were expressed and purified on a large scale in preparation for crystallographic analysis. TRI_03 was observed to precipitate at high protein concentration, likely due to its propensity to aggregate. In contrast, TRI_05 was stable at high concentrations and was subjected to crystallographic trials.

Structural Characterization of TRI_05.

We solved the structure of TRI_05 to 2.2-Å resolution (see Table S1 for data collection and structural refinement statistics). The designed trimeric topology of TRI_05 was observed in the crystal structure, and electron density corresponding to a tris-bipyridine metal complex was clearly visible at the trimeric interface (Fig. 3 A and B). The rmsd between the design model and the crystal structure is 1.4 Å for all atoms on the first two helices that participate directly in the interface. When superimposed on the interface residues, global deviations between the design model and structure are observed in the C-terminal repeats, likely due to slight deviations in the vicinity of the designed interface that propagate into much larger differences in the remainder of the protein. A global fit of all backbone atoms in the design to the structure gives an rmsd of 2.5 Å. Very good agreement between the design model and the structure was observed for the side chains in the interface, with the exception of Ile-14, Leu-38, and Met-46. Ile-14 adopts different rotamers in the design model and the solved structure (Fig. 3C). Larger deviations are observed for Leu-38 and Met-46, due to an unanticipated interaction between Met-46 and Leu-26 that forces Leu-38 into a rotameric state not present in the design model.
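To make the distinction between the interface rmsd and the global rmsd concrete, the sketch below computes the rmsd over a chosen subset of matched atoms for two coordinate sets that are assumed to be already superimposed. The superposition step itself (for example, a Kabsch fit) is deliberately left out, and the coordinates are invented; they are chosen only so that atoms far from the "interface" deviate more, as described in the text.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists.

    Assumes the two sets have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Invented example: four matched atoms; the first two stand in for the interface.
design = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
crystal = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 3.0, 0.0), (11.4, 3.0, 0.0)]

interface_rmsd = rmsd(design[:2], crystal[:2])  # atoms near the interface only
global_rmsd = rmsd(design, crystal)             # all matched atoms
```

Which atoms are passed in, and which atoms were used for the preceding superposition, determines whether the result is comparable to the interface value or to the global value quoted above.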

Table S1.

Data collection and refinement statistics

Feature                              Value(s)
Resolution range, Å                  28.92–2.25 (2.33–2.25)
Space group                          P 1 21 1
Unit cell (a, b, c, Å; α, β, γ, °)   63.139 63.951 64.841 90 116.857 90
Total reflections                    169,819 (12,418)
Unique reflections                   22,017 (2,144)
Multiplicity                         7.7 (5.7)
Completeness, %                      100 (100)
Mean I/sigma(I)                      13.93 (1.22)
Wilson B-factor                      39.49
R merge                              0.1325 (2.059)
R meas                               0.1423 (2.285)
CC1/2                                0.997 (0.416)
CC*                                  0.999 (0.767)
Reflections used in refinement       21,967 (2,143)
Reflections used for R free          2,202 (219)
R work                               0.2139 (0.4562)
R free                               0.2461 (0.5277)
CC (work)                            0.961 (0.571)
CC (free)                            0.941 (0.411)
No. of nonhydrogen atoms             3,693
  Macromolecules                     3,536
  Ligands                            52
Protein residues                     476
rms (bonds), Å                       0.004
rms (angles), °                      0.77
Ramachandran favored, %              99
Ramachandran allowed, %              0.85
Ramachandran outliers, %             0
Rotamer outliers, %                  0.26
Clashscore                           1.53
Average B-factor                     54.03
  Macromolecules                     54.32
  Ligands                            44.18
  Solvent                            49.03
No. of TLS groups                    29

Statistics for the highest-resolution shell are shown in parentheses.

Fig. 3.

X-ray crystallographic analysis of TRI_05. (A and B) Electron density in the vicinity of the [Fe(Bpy-ala)3]2+ complex of TRI_05 is shown contoured at 1.5 σ. (C) An overlay of the design model (white) with the solved structure (blue) is shown in the vicinity of the designed interface. All designed residues in the interface are depicted as sticks. Residues whose side chains deviated from the designed conformation are labeled.

Transient Absorbance Analysis of TRI_05.

Femtosecond time-resolved spectroscopic methods can be used to probe the photophysical properties of complexes such as [Fe(Bpy)3]2+ (21). Photoexcitation of such complexes in the MLCT domain results in the transfer of an electron from the metal ion to the ligands, and the solvent environment can affect the evolution of the excited state (22). Because TRI_05 represents the first example of a protein containing a [Fe(Bpy-ala)3]2+ complex, we explored the possibility that the excited-state dynamics would be altered by encapsulation of the complex within the protein environment. We therefore carried out femtosecond time-resolved absorption spectroscopy on both TRI_05 and the free [Fe(Bpy-ala)3]2+ complex. Excitation at 440 nm resulted in a characteristic loss of absorbance (bleaching) in the MLCT range (450–700 nm; Fig. 4A). Recovery from this bleach was used to determine excited state lifetime. Overlays of the excited-state decays measured at 530 nm (corresponding to the wavelength with the largest ΔA) of TRI_05 and the free [Fe(Bpy-ala)3]2+ complex are essentially superimposable (Fig. 4B), and the excited-state lifetimes are identical within experimental error (Fig. S2). The excited-state lifetime of TRI_05 may be close to that of free [Fe(Bpy-ala)3]2+ because the [Fe(Bpy-ala)3]2+ complex is only partially buried within the protein interface and is solvent-accessible on one face (Fig. 3 A and B).

Fig. 4.

Transient absorbance analysis of TRI_05. (A) Bleaching of TRI_05 MLCT absorbance at seven distinct time delays after excitation at 440 nm. (B) Excited-state lifetime of TRI_05 (red line) and [Fe(Bpy-ala)3]2+ (blue line) complexes after excitation at 440 nm. Absorbance measurements were obtained at 530 nm.

Fig. S2.

Monoexponential fits of transient absorbance data of [Fe(Bpy-ala)3]2+ and TRI_05. Data collected for the free Bpy-ala in complex with Fe2+ and TRI_05 (red lines) along with the associated fits (black lines) are shown in A and B, respectively. Fits suggested lifetimes of 610 ps for [Fe(Bpy-ala)3]2+ (A) and 690 ps for TRI_05 (B). These lifetimes are essentially identical within error.
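For readers unfamiliar with this kind of analysis, the sketch below shows one simple way to extract a lifetime from a bleach-recovery trace by fitting ΔA(t) = A0·exp(−t/τ). It uses a log-linear least-squares fit on synthetic, noise-free data generated with τ = 610 ps. The fitting procedure actually used for Fig. S2 is not specified beyond "monoexponential," so the method, function name, time grid, and amplitude here are assumptions.

```python
import math

def fit_monoexponential(times_ps, delta_a):
    """Fit ln|ΔA| = ln|A0| - t/τ by linear least squares.

    Returns (|A0|, τ in ps). Works only while the signal keeps one sign and
    stays away from zero; noisy data are usually better served by nonlinear
    least squares on the untransformed signal.
    """
    ys = [math.log(abs(v)) for v in delta_a]
    n = len(times_ps)
    mean_t = sum(times_ps) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_ps)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_ps, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope

# Synthetic bleach recovery (negative ΔA), not experimental data.
times = [50.0 * k for k in range(1, 41)]                  # 50 to 2000 ps
signal = [-0.02 * math.exp(-t / 610.0) for t in times]    # τ = 610 ps by construction
amplitude, tau_ps = fit_monoexponential(times, signal)
```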

Metal Dependence of the Complex.

TRI_05 was purified from the expression host in complex with Fe2+. To analyze the dependence of the protein complex on the presence of bound metal, we attempted to remove the metal from the protein by adding an excess of the potent chelator 1,10-phenanthroline (Phen), which has an affinity for Fe2+ ∼4 orders of magnitude higher than Bpy (23). Incubation of TRI_05 with an excess of Phen at 25 °C did not result in rapid exchange of the two metal ligands. We reasoned that metal removal might be accelerated at increased temperatures and thus examined the thermal stability of TRI_05 using CD analysis (Fig. S3). Partial unfolding was observed with increasing temperatures [Fig. S3D; TRI_05 does not fully denature even at 95 °C (Fig. S3B)] and was reversed upon cooling to 25 °C (Fig. S3C). TRI_05 was incubated with an excess of Phen for 1 h at 65 °C, after which spectroscopic analysis indicated the presence of a [Fe(Phen)3]2+ complex, with no discernable signal indicative of [Fe(Bpy)3]2+ complex (Fig. S4 A and B). After removal of unbound Phen and [Fe(Phen)3]2+ complex, apo TRI_05 was found to elute from the SEC column at the same volume observed before metal removal (Fig. S4C), suggesting that formation of the trimeric complex does not depend on Fe2+. Eluted TRI_05 was analyzed spectroscopically at high concentration, and no signal in the range of 450–575 nm was observed, suggesting that the bound Fe2+ had been removed (Fig. S4D). To further confirm the removal of Fe2+, we carried out inductively coupled plasma mass spectroscopic (ICP-MS) analysis of TRI_05 before and after Fe2+ removal. Because an elevated Zn2+ concentration was observed in the apo sample (Table S2), we cannot rule out the possibility that apo TRI_05 scavenged Zn2+ to reform the trimer. However, the trimeric complex also formed when Bpy-ala was substituted by tyrosine at protein concentrations >75 µM (SI Materials and Methods and Fig. S4E), suggesting that complex formation does not require metal.

Fig. S3.

CD analysis of TRI_05 stability. (A–C) CD spectra of TRI_05 measured at 25 °C (A), at 95 °C (B), and after cooling to 25 °C (C) are shown. (D) The change in CD signal at 220 nm with increasing temperature is shown.

Fig. S4.

Metal removal and concentration-dependent oligomerization of TRI_05. (A) Spectroscopic analysis of [Fe(Phen)3]2+ (red line) and [Fe(Bpy-ala)3]2+ (blue line) complexes in the near-UV and visible range (300–600 nm) is shown. (B) Spectroscopic analysis of TRI_05 in the near-UV and visible range is shown before (blue) and after (red) incubation of the protein with Phen at 65 °C. The change in spectral signature in this range suggests the removal of Fe2+ from TRI_05 by Phen. (C) SEC analysis of apo TRI_05 (blue line) is shown in comparison with TRI_05 containing Fe2+ (red line) and the parent scaffold (black line). The presence of a shoulder on the apo TRI_05 trace suggests partial dissociation of the TRI_05 trimer upon metal removal. (D) Spectroscopic analysis of apo TRI_05. Absorbance is shown in the range of 230–600 nm. An absorbance maximum at 280 nm was observed, which was blue-shifted relative to the spectrum of TRI_05 bound to Fe2+. (D, Inset) The MLCT absorbance range is shown. The lack of any signal from 420 to 600 nm suggests that the Fe2+ has been removed from the protein complex. (E) Comparison of TRI_05 (red trace) with a mutant in which Bpy-ala was replaced with tyrosine (blue trace). Analysis was carried out on a Superdex 75 5/150 analytical column. To further confirm the observation, a comparison with protein standards of known size is shown (black trace). The first peak in the protein standards triplet is BSA with a molecular weight of 66,000 g/mol. (F) The concentration dependence of complex formation was examined. Apo TRI_05 was concentrated to 750 µM and analyzed by SEC (black trace). Fractions collected from this run were then diluted to concentrations of 75 µM (blue trace) and 35 µM (red trace) and subjected to gel filtration. The elution volume of the major peak shifts from 14.6 mL in the most concentrated sample to 16.0 mL at the intermediate concentration and to 16.6 mL in the least concentrated sample, suggesting a concentration dependence of trimer formation. (G) Spectral analysis of apo TRI_05 before (blue line) and after (red line) readdition of Fe2+. (H) SEC analysis of apo TRI_05 before (blue trace) and after (red trace) readdition of Fe2+.

Table S2.

ICP-MS analysis of TRI_05 and apo TRI_05

Sample        Fe, ng    Zn, ng
Buffer        11.35     ND
apo_TRI_05    86.75     161.82
TRI_05_wt     982.42    14.96

All metal amounts are in nanograms. ND, not detected.

We then examined the ability to reform the [Fe(Bpy-ala)3]2+ complex through addition of Fe2+ to solutions of apo TRI_05. At concentrations <35 µM, apo TRI_05 eluted from the SEC column at a size indicative of a monomer (Fig. S4F). Apo TRI_05 was diluted to a concentration of 2.5 µM and was reconcentrated after incubation with 10 µM ferrous ammonium sulfate via ultrafiltration. The concentrated protein had an absorbance spectrum consistent with the original TRI_05 (Fig. S4G), and analysis by SEC suggested that the complex had partially reformed, whereas apo TRI_05 subjected to the same conditions in the absence of Fe2+ remained a monomer (Fig. S4H).

SI Materials and Methods

Protein Expression and Purification Protocol.

Analytical protein expression and purification.

Five colonies of BL21(DE3) cells (Life Technologies) cotransformed with a pET21 plasmid containing a designed protein and the pEVOL-BpyRS plasmid were picked from LB-agar plates and used to inoculate a 2-mL culture in Terrific Broth (TB) containing ampicillin (100 µg/mL) and chloramphenicol (34 µg/mL) antibiotics. This culture was grown to saturation at 37 °C (∼16 h), at which time cells were pelleted through centrifugation and resuspended in 1 mL of fresh TB. A total of 100 µL of the resuspended cells was used to inoculate each of two 10-mL cultures in TB also supplemented with ampicillin and chloramphenicol at the above concentrations. The optical density of the 10-mL cultures was monitored until an OD600 of ∼0.75 was reached. The incubation temperature was dropped to 18 °C, and Bpy-ala was added to one of the two cultures to a final concentration of 150 µM. After ∼1 h of continued growth, L-arabinose and isopropyl β-D-1-thiogalactopyranoside were added to all cultures to final concentrations of 0.02% and 1 mM, respectively, regardless of the presence or absence of Bpy-ala in the culture. Cultures were incubated at 18 °C for ∼30 additional hours, at which time the cells were harvested via centrifugation.

Cell pellets were resuspended in buffer containing 50 mM Tris (pH 8.0), 300 mM NaCl, and 30 mM imidazole, which was supplemented with hen egg white lysozyme (HEWL; Sigma-Aldrich) and DNase I (Sigma-Aldrich), each at a concentration of 1 mg/mL, and cOmplete Ultra protease inhibitor (Roche), which was used as specified by the manufacturer. The resuspended cells were then transferred to 1.5-mL microfuge tubes and were lysed via sonic dismembration by using a microplate bath sonicator (Qsonica). Lysates were clarified by centrifugation at 20,817 × g for 15 min and loaded directly onto 150 µL of Ni–nitrilotriacetic acid immobilized metal ion affinity resin (Qiagen) equilibrated with the resuspension buffer. The resin was washed with 10 column volumes (CV) of the buffer above lacking the HEWL, DNase I, and protease inhibitor. To remove nonspecifically bound proteins, the resin was then washed with 15 CV of buffer containing 50 mM Tris (pH 8.0), 300 mM NaCl, and 70 mM imidazole and subsequently eluted with buffer containing 50 mM Tris (pH 8.0), 300 mM NaCl, and 250 mM imidazole. Bpy-ala–dependent expression of each designed protein was then analyzed by SDS/PAGE (Fig. S1).

Large-scale protein expression and purification of TRI_05 for crystallographic trials.

After identification of TRI_05 as a design of interest, large-scale (1-L) expressions were carried out as follows: The 10-mL starter cultures were grown to saturation and used to inoculate 1-L TB cultures containing ampicillin and chloramphenicol at the aforementioned concentrations. After an OD600 of 0.75 was reached, Bpy-ala was added to all cultures to a final concentration of 150 µM, and the cultures were again cooled to 18 °C before the addition of L-arabinose and isopropyl β-D-1-thiogalactopyranoside to final concentrations of 0.02% and 1 mM, respectively. Protein expression was allowed to proceed at 18 °C for an additional 30 h, at which time the cells were harvested via centrifugation at 7,500 × g for 10 min.

Cell pellets were resuspended in 30 mL of buffer containing 50 mM Tris (pH 8.0), 1 M NaCl, and 30 mM imidazole supplemented with HEWL and DNase I at a concentration of 1 mg/mL, and 1 tablet of cOmplete Ultra protease inhibitor mixture (large size, appropriate for 50-mL samples). The resuspended cells were then lysed by sonic dismembration. Lysates were clarified by centrifugation at 22,789 × g for 15 min, and the supernatants were decanted into a new 50-mL conical centrifuge tube. Because previous CD analyses suggested that TRI_05 was stable up to 95 °C, lysates were then heated to 55 °C for 1 h. Heating the lysates resulted in the precipitation of many soluble host proteins, while leaving TRI_05 in solution. The protein precipitate was removed from solution by centrifugation at 38,724 × g for 45 min, after which the supernatant was further clarified by passage through a 0.45-µm syringe filter (Millipore).

A total of 50 mL of clarified TRI_05 lysates was then directly loaded onto a 5-mL HisTrap column (GE Life Sciences), and purification was carried out on an Äkta Pure FPLC system. The column was washed with the high-salt wash buffer described above until the absorbance at 280 nm reached the baseline value. Scouting experiments indicated that, at this point, essentially only TRI_05 remained on the column and that a gradient elution was not necessary. Thus, TRI_05 was subsequently eluted with 500 mM imidazole.

To prepare the protein for crystallographic trials, fractions from the initial HisTrap purification containing TRI_05 were pooled and concentrated via centrifugal ultrafiltration to a volume of 1–2 mL. The concentrated proteins were then loaded onto a Superdex 75 SEC column (GE Life Sciences) equilibrated with 25 mM Tris (pH 8.0) and 150 mM NaCl. The SEC purification step served to further purify the TRI_05 protein, as well as to exchange it into buffer conditions suitable for crystallographic trials. Fractions collected from the SEC run were concentrated to 15 mg/mL and subjected to crystallographic trials as described in the main text.

Expression of TRI_05 containing tyrosine in place of Bpy-ala.

BL21(DE3) cells were cotransformed with a pET21 plasmid containing TRI_05 and a pEVOL-tyrRS plasmid that suppresses amber stop codons with tyrosine rather than Bpy-ala. Protein expression and purification were carried out as described above. Purified tyrosine-containing TRI_05 was subjected to SEC analysis under the same conditions as the Bpy-ala complex (Fig. S4E).

TA Methods.

Although there was significant variability in signal-to-noise ratios for samples collected on different days and under different conditions, there was no observable change to excited-state lifetime in the presence or absence of oxygen (Fig. S5) or in the presence or absence of sample stirring. For simplicity, most data were collected with samples open to atmosphere.

Fig. S5.

Transient absorbance of [Fe(Bpy-ala)3]2+ complexes collected in air and in an inert environment. Transient absorbance data for small-molecule [Fe(Bpy-ala)3]2+ complexes collected in air (red line) and under N2 (blue line) are shown.

The docking calculations were carried out with purpose-built code that is not currently included in the release versions of Rosetta. To obtain this code, or to request assistance with the use of the methods described above, please contact the corresponding author.

Discussion

From the perspective of design of functional metalloproteins, the use of the Bpy functional group in a biological context provides a number of advantages. The Bpy ligand provides two pyridyl nitrogens in an orientation compatible with metal binding and therefore has a higher inherent metal affinity than any naturally occurring amino acid. Bpy is also neutral and nonpolar, and hence it should be compatible with protein interfaces comprising mostly hydrophobic residues. The energies of formation of octahedral complexes of Bpy with biologically relevant metals range from −18.0 kcal/mol for Zn2+ to −27.5 kcal/mol for Ni2+ (values calculated from affinities reported in ref. 23), rivaling those of the tightest known protein–protein interactions (24). However, one potential drawback to the use of Bpy-ala in this manner is that these energetically favorable complexes form very rapidly and do not readily exchange at neutral pH (25). Thus, although genetically encoded Bpy-ala could potentially be used to nucleate a C-3 symmetric, homotrimeric protein complex around a bound metal, the potential to kinetically trap the protein complex as a monomer or dimer with free Bpy-ala or form complexes with unintended orientations must be avoided.
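The relationship between a formation free energy and the corresponding overall binding constant is ΔG = −RT ln β. The short Python sketch below inverts this relationship to show what log10 β values are implied by the two energies quoted above. The temperature of 298.15 K is an assumption (the text does not state the temperature used for the conversion), and the function name is illustrative.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol·K)

def log10_beta_from_delta_g(delta_g_kcal_per_mol, temperature_k=298.15):
    """log10 of the overall formation constant implied by ΔG = -RT ln β.

    The default temperature is an assumption, not a value from the paper.
    """
    return -delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k * math.log(10))

log_beta_zn = log10_beta_from_delta_g(-18.0)  # energy quoted in the text for Zn2+
log_beta_ni = log10_beta_from_delta_g(-27.5)  # energy quoted in the text for Ni2+
```

Under these assumptions, the two energies correspond to log10 β values of roughly 13.2 and 20.2, a difference of about seven orders of magnitude in the overall formation constant.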

We addressed these issues by using the Rosetta computational protein design methodology in two ways: to identify sites of Bpy-ala incorporation in protein scaffolds of known structure that were sterically compatible with complex formation, and to design novel protein–protein interfaces that provide an additional driving force for the formation of the desired complexes to the exclusion of other, undesired states. Of the seven computationally designed proteins we characterized experimentally, two were observed to have spectroscopic signatures indicative of the formation of [Fe(Bpy-ala)3]2+ complexes. Determination of the structure of one of these designs, TRI_05, revealed a trimeric metalloprotein containing the desired [Fe(Bpy-ala)3]2+ complex; high similarity between the solved structure and the design model was observed in the vicinity of the trimeric interface. Removal of the bound Fe2+ ion from TRI_05 with the potent metal chelator Phen did not result in the dissolution of the trimeric protein complex into its component monomers (Fig. S4C). This result highlights the ability of computational protein interface design algorithms to engineer high-affinity interfaces between proteins that can be used to drive the formation of complexes with desired geometries.

The computational methods developed in the course of this study could be extended to generate other novel proteins with useful properties. The use of other metals or metal-chelating NCAAs [e.g., (8-hydroxyquinolin-3-yl)alanine (26)] could result in the formation of twofold symmetric protein–protein complexes with similar stabilities. The ability to site-specifically place metal ions within the context of protein complexes could enable the development of new biomaterials with useful properties. Finally, we demonstrated the ability to remove and reintroduce the bound iron, suggesting the possibility of introducing photoactive metals such as Ru2+ or Os2+, paving the way for the development of proteins with photochemical properties not achievable by using exclusively naturally occurring amino acids.

Materials and Methods

Generation and Cloning of Designed Proteins.

Sequences of the designed proteins were backtranslated and optimized for expression in E. coli by using DNAWorks (27). DNAWorks also generates sets of overlapping oligonucleotides from which full-length genes can be generated through assembly PCR (27). These oligonucleotides were ordered from Integrated DNA Technologies and assembled through a two-step PCR. PCR-assembled gene fragments were then cloned into a pET21 expression vector (Novagen) between NdeI and XhoI restriction sites. Sequences of the full-length genes were confirmed by Sanger sequencing.

Protein Expression and Purification.

Slightly different protocols were used for analytical and production-scale protein expressions. Detailed descriptions of each of these methods can be found in SI Materials and Methods; a general protein expression and purification protocol follows. Sequence-confirmed pET21 expression plasmids were cotransformed with the pEVOL-BpyRS plasmid into chemically competent BL21(DE3) cells (Life Technologies) and were selected on LB-agar plates containing ampicillin and chloramphenicol. Colonies with resistance to both antibiotics were used to inoculate expression cultures in Terrific Broth supplemented with both antibiotics. Expression cultures were grown at 37 °C until an optical density at 600 nm of ∼0.75 was reached. The temperature was dropped to 18 °C, and the Bpy-ala NCAA was added to a final concentration of 150 µM. Expression was induced through the addition of L-arabinose and isopropyl β-D-1-thiogalactopyranoside to final concentrations of 0.02% and 1 mM, respectively, and growth was allowed to continue at 18 °C for 30 h, at which time cells were harvested via centrifugation.

Cell pellets were lysed via sonication, and the lysates were clarified by centrifugation. The designed proteins were then purified by immobilized metal ion affinity chromatography on Ni–nitrilotriacetic acid resin (Qiagen), followed by SEC analysis on a Superdex 75 column (GE Biosciences). The SEC step served to further purify the protein and also indicated the size of the expressed proteins.

Crystallization of TRI_05.

TRI_05 was dialyzed against 25 mM Tris buffer (pH 8.0) containing 150 mM NaCl and concentrated via centrifugal ultrafiltration to a final concentration of 12 mg/mL. Concentrated TRI_05 was screened by using the sparse matrix method (28), with a Phoenix Robot and the following crystallization screens: Crystal Screen, SaltRx, PEG/Ion, Index, and PEGRx (Hampton Research) and Berkeley Screen (Lawrence Berkeley National Laboratory). Crystals of TRI_05 were found in the Berkeley Screen condition consisting of 0.1 M magnesium chloride, 0.1 M sodium acetate trihydrate (pH 4.5), and 30% (vol/vol) pentaerythritol propoxylate. TRI_05 crystals were obtained after 5 d by the sitting-drop vapor-diffusion method with the drops consisting of a mixture of 0.2 μL of protein solution and 0.2 μL of reservoir solution.
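The starting composition of the crystallization drop follows from the 1:1 mixing described above and can be sketched as below. This is an illustrative calculation only: the function name and dictionary labels are hypothetical, and the numbers describe the drop immediately after mixing, before vapor diffusion against the reservoir concentrates it.

```python
def initial_drop(protein_ul, reservoir_ul, protein_solution, reservoir_solution):
    """Return component concentrations immediately after mixing the two solutions.

    A component missing from one solution is treated as absent (zero) there.
    """
    total_ul = protein_ul + reservoir_ul
    components = sorted(set(protein_solution) | set(reservoir_solution))
    return {
        name: (protein_ul * protein_solution.get(name, 0.0)
               + reservoir_ul * reservoir_solution.get(name, 0.0)) / total_ul
        for name in components
    }


# Concentrations as stated in the text; the dictionary keys are illustrative labels.
PROTEIN_SOLUTION = {"TRI_05 (mg/mL)": 12.0, "Tris (mM)": 25.0, "NaCl (mM)": 150.0}
RESERVOIR_SOLUTION = {
    "magnesium chloride (mM)": 100.0,
    "sodium acetate (mM)": 100.0,
    "pentaerythritol propoxylate (% vol/vol)": 30.0,
}

drop = initial_drop(0.2, 0.2, PROTEIN_SOLUTION, RESERVOIR_SOLUTION)
for name, value in drop.items():
    print(f"{name}: {value:g}")
```

With equal volumes every component is simply halved, so the drop starts at about 6 mg/mL protein and 15% (vol/vol) precipitant.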

Metal Removal from TRI_05.

TRI_05 at concentrations between 50 and 75 µM and in buffer containing 50 mM Tris (pH 7.6) and 150 mM NaCl was directly mixed with Phen to give a final chelator concentration of 1 mM. The protein and Phen mixture was then incubated at 65 °C for 1 h. To separate TRI_05 from free Phen and the [Fe(Phen)3]2+ complex generated during the incubation, the mixture was subjected to gel filtration immediately after the incubation was complete.

Readdition of Fe2+ to TRI_05.

Apo TRI_05 was diluted to a concentration of 2.5 µM in buffer containing 50 mM Tris (pH 7.6) and 150 mM NaCl, from which contaminating metals had been removed by treatment with Chelex 100 resin. Ferrous ammonium sulfate was added to the protein solution to a final concentration of 10 µM, and the mixture was applied to a Vivaspin 20 centrifugal concentrator (Sartorius). The protein was concentrated via centrifugation at 4,500 × g to a final volume of 0.5 mL and was immediately subjected to analysis by SEC.
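The reagent excesses implied by the metal removal and readdition protocols can be estimated with a short calculation. The sketch below rests on two assumptions that are not stated in the text: that the quoted protein concentrations refer to monomer, and that each trimer carries a single [Fe(Bpy-ala)3]2+ site. The function names are hypothetical.

```python
def fe_site_concentration(monomer_um, monomers_per_site=3):
    """Concentration of [Fe(Bpy-ala)3]2+ sites, assuming one site per trimer."""
    return monomer_um / monomers_per_site


def phen_fold_excess(phen_um, monomer_um, phen_per_fe=3):
    """Phen supplied relative to the amount needed to form [Fe(Phen)3]2+ at every site."""
    required_um = phen_per_fe * fe_site_concentration(monomer_um)
    return phen_um / required_um


def fe_fold_excess(fe_um, monomer_um):
    """Fe2+ supplied relative to the number of trimer binding sites."""
    return fe_um / fe_site_concentration(monomer_um)


# Metal removal: 1 mM Phen with 50-75 uM protein (assumed to be monomer concentration).
for monomer_um in (50.0, 75.0):
    excess = phen_fold_excess(1000.0, monomer_um)
    print(f"{monomer_um:.0f} uM protein, 1 mM Phen: {excess:.1f}-fold excess")

# Readdition: 10 uM ferrous ammonium sulfate with 2.5 uM apo protein.
print(f"2.5 uM protein, 10 uM Fe2+: {fe_fold_excess(10.0, 2.5):.1f}-fold excess")
```

Under these assumptions the chelator is present at roughly 13- to 20-fold over the stoichiometric requirement during removal, and Fe2+ at about 12-fold over the available sites during readdition.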

ICP-MS Analysis of Proteins.

To prepare samples for ICP-MS analysis, TRI_05 bound to Fe2+ and apo TRI_05 were dialyzed extensively against 1.5 L of 50 mM Tris (pH 7.6) and 150 mM NaCl. In total, the buffer was changed five times. Before dialysis, beakers were soaked overnight in 2 mM EDTA in an effort to remove contaminating metals. Buffers were also treated with Chelex 100 (Bio-Rad) resin to remove exogenous metals. Samples were digested for 3 h in trace metal grade HNO3 and H2O2 with heating. After drying, samples were further digested overnight in concentrated HNO3 and HCl, and again were dried. The dried samples were reconstituted with 15 mL of dilute HNO3 and directly applied to the instrument. Analysis was carried out on an iCAP Q quadrupole Electron X-series ICP-MS instrument (Thermo) in kinetic energy discrimination mode.

X-Ray Diffraction Collection and Structural Solution of TRI_05.

X-ray diffraction data were obtained at the Advanced Light Source on beamline 8.2.1 at a wavelength of 1.0 Å, and the initial data were processed with HKL2000 (29). Phase information was obtained with molecular replacement using PHASER (30); a single chain of the TRI_05 design in which all side chains were truncated at Cβ was used as a search model. Side chains were modeled with AutoBuild (31) by using the sequence of the designed protein in which Bpy-ala was mutated to alanine. Bpy-ala was substituted for alanine in the AutoBuild-generated model by using PyMol (Version 1.7.2.2; Schrödinger, LLC). The model containing Bpy-ala was used as an input for a round of Rosetta-Phenix refinement (32), after which iterative rounds of structural refinement (including addition of water molecules) were carried out with PHENIX (33).

Transient Absorption Studies.

For transient absorption (TA) studies, 50-fs laser pulses were generated by a Libra Ti:Sapphire laser system (Coherent). Approximately 75% (3 W) of the Ti:Sapphire output was used to pump an OPerA Solo Optical Parametric Amplifier (Coherent) to generate 440-nm excitation pulses, while the remainder (1 W) was reserved for visible probe generation. The excitation and probe beams were directed into a Helios TA spectrometer (Ultrafast Systems), where broadband, visible probe light was generated at a sapphire plate. TA spectra were collected in random order at 300 log-spaced time delays over the interval of ∼2 ps to 5 ns; spectra at each time point were averaged over three scans. Data were processed with Surface Explorer software (Ultrafast Systems) and plotted and fit to single-exponential decays by using Matlab 2014b curve fitting software (MathWorks). Spectra of TRI_05 and [Fe(Bpy-ala)3]2+ were collected at concentrations of 100 µM in 30 mM Tris (pH 8.0) and 150 mM NaCl in quartz cuvettes. An attempt was made to degas the TRI_05 sample, but this resulted in precipitation of the protein. TRI_05 spectra were therefore collected in cuvettes open to air. Degassing of the [Fe(Bpy-ala)3]2+ complex was possible, and the effect of oxygen on the recorded spectra was examined. A [Fe(Bpy-ala)3]2+ sample was degassed before analysis via repeated rounds of evacuation of a stoppered cuvette followed by backfilling with N2. Data collected in the presence and absence of air were indistinguishable from one another (Fig. S5).
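The delay sampling and single-exponential analysis described above can be illustrated with a small, self-contained sketch. It is not the Surface Explorer or Matlab workflow used for the reported data: the fit here is a log-linear least-squares fit substituted for nonlinear curve fitting, the trace is synthetic and noise-free, and the 1,000-ps lifetime is a hypothetical value chosen only for the demonstration. The log-linear form requires a strictly positive signal, so a negative-going trace would be fit as its magnitude.

```python
import math


def log_spaced_delays(t_min_ps, t_max_ps, n_points):
    """n_points delays evenly spaced in log10(time) between t_min_ps and t_max_ps."""
    lo, hi = math.log10(t_min_ps), math.log10(t_max_ps)
    step = (hi - lo) / (n_points - 1)
    return [10 ** (lo + i * step) for i in range(n_points)]


def fit_single_exponential(times, signal):
    """Fit signal = A * exp(-t / tau) by linear least squares on ln(signal).

    Requires strictly positive signal values; returns (A, tau).
    """
    n = len(times)
    ys = [math.log(s) for s in signal]
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


delays = log_spaced_delays(2.0, 5000.0, 300)  # ~2 ps to 5 ns, 300 points
true_amplitude, true_tau_ps = 1.0, 1000.0     # hypothetical values, not measured
trace = [true_amplitude * math.exp(-t / true_tau_ps) for t in delays]
amplitude, tau_ps = fit_single_exponential(delays, trace)
print(f"fitted amplitude = {amplitude:.3f}, fitted lifetime = {tau_ps:.1f} ps")
```

Logarithmic spacing places many points at early delays, where the signal changes fastest, while still covering the nanosecond tail with a manageable number of measurements.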

Supplementary Material

Supplementary File
pnas.1600188113.sd01.docx (28.7KB, docx)

Acknowledgments

We thank Peter Schultz for the generous gift of the pEVOL-BpyRS plasmid; Neil P. King for helpful discussions; Prof. Cody Schlenker (Office of Naval Research DURIP Grant N00014-14-1-0757) for access to the ultrafast TA laser system; Tim Pollock for experimental assistance; and Gwyneth Gordon and Trevor Martin for assistance with ICP-MS analysis. The Berkeley Center for Structural Biology is supported in part by the National Institutes of Health, National Institute of General Medical Sciences, and the Howard Hughes Medical Institute. The Advanced Light Source is supported by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy Contract DE-AC02-05CH11231. J.H.M. was supported by the National Institute of General Medical Sciences of the National Institutes of Health Award F32GM099210. D.B. and J.H.M. were supported by Defense Threat Reduction Agency Award HDTRA1-11-1-0041. M.E.E. was supported by the ACS Irving S. Sigal Postdoctoral Fellowship. F.P. was the recipient of Swiss National Science Foundation Postdoc Fellowship PBZHP3-125470 and Human Frontier Science Program Long-Term Fellowship LT000070/2009-L. G.O. is a Marie Curie International Outgoing Fellowship fellow (332094 ASR-CompEnzDes FP7-People-2012-IOF).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The atomic coordinates of TRI_05 have been deposited in the Protein Data Bank, www.pdb.org (PDB ID code 5eil).

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1600188113/-/DCSupplemental.

References

  • 1. Kaes C, Katz A, Hosseini MW. Bipyridine: The most widely used ligand. A review of molecules comprising at least two 2,2′-bipyridine units. Chem Rev. 2000;100(10):3553–3590. doi: 10.1021/cr990376z.
  • 2. Leininger S, Olenyuk B, Stang PJ. Self-assembly of discrete cyclic nanostructures mediated by transition metals. Chem Rev. 2000;100(3):853–908. doi: 10.1021/cr9601324.
  • 3. Lieberman M, Sasaki T. Iron(II) organizes a synthetic peptide into three-helix bundles. J Am Chem Soc. 1991;113(4):1470–1471.
  • 4. Ghadiri MR, Soares C, Choi C. A convergent approach to protein design. Metal ion-assisted spontaneous self-assembly of a polypeptide into a triple-helix bundle protein. J Am Chem Soc. 1992;114(3):825–831.
  • 5. Xie J, Liu W, Schultz PG. A genetically encoded bidentate, metal-binding amino acid. Angew Chem Int Ed Engl. 2007;46(48):9239–9242. doi: 10.1002/anie.200703397.
  • 6. Mills JH, et al. Computational design of an unnatural amino acid dependent metalloprotein with atomic level accuracy. J Am Chem Soc. 2013;135(36):13393–13399. doi: 10.1021/ja403503m.
  • 7. Lee HS, Schultz PG. Biosynthesis of a site-specific DNA cleaving protein. J Am Chem Soc. 2008;130(40):13194–13195. doi: 10.1021/ja804653f.
  • 8. Kang M, et al. Evolution of iron(II)-finger peptides by using a bipyridyl amino acid. ChemBioChem. 2014;15(6):822–825. doi: 10.1002/cbic.201300727.
  • 9. Luo X, Wang TA, Zhang Y, Wang F, Schultz PG. Stabilizing protein motifs with a genetically encoded metal-ion chelator. Cell Chem Biology. 2016;23(9):1098–1102. doi: 10.1016/j.chembiol.2016.08.007.
  • 10. Dunn MF. Zinc-ligand interactions modulate assembly and stability of the insulin hexamer—a review. Biometals. 2005;18(4):295–303. doi: 10.1007/s10534-005-3685-y.
  • 11. Salgado EN, Faraone-Mennella J, Tezcan FA. Controlling protein-protein interactions through metal coordination: Assembly of a 16-helix bundle protein. J Am Chem Soc. 2007;129(44):13374–13375. doi: 10.1021/ja075261o.
  • 12. Der BS, et al. Metal-mediated affinity and orientation specificity in a computationally designed protein homodimer. J Am Chem Soc. 2012;134(1):375–385. doi: 10.1021/ja208015j.
  • 13. Leaver-Fay A, et al. ROSETTA3: An object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol. 2011;487:545–574. doi: 10.1016/B978-0-12-381270-4.00019-6.
  • 14. Parmeggiani F, et al. A general computational approach for repeat protein design. J Mol Biol. 2015;427(2):563–575. doi: 10.1016/j.jmb.2014.11.005.
  • 15. Park K, Shen BW, Parmeggiani F, Huang PS. Control of repeat-protein curvature by computational protein design. Nat Struct Mol Biol. 2015;22(2):167–174. doi: 10.1038/nsmb.2938.
  • 16. Kuhlman B, Baker D. Native protein sequences are close to optimal for their structures. Proc Natl Acad Sci USA. 2000;97(19):10383–10388. doi: 10.1073/pnas.97.19.10383.
  • 17. Lawrence MC, Colman PM. Shape complementarity at protein/protein interfaces. J Mol Biol. 1993;234(4):946–950. doi: 10.1006/jmbi.1993.1648.
  • 18. Meyer TJ. Photochemistry of metal coordination complexes: Metal to ligand charge transfer excited states. Pure Appl Chem. 1986;58(9). doi: 10.1351/pac198658091193.
  • 19. Mason SF. The electronic spectra and optical activity of phenanthroline and dipyridyl metal complexes. Inorganica Chimica Acta Reviews. 1968;2:89–109.
  • 20. Hong P, Koza S, Bouvier ESP. Size-exclusion chromatography for the analysis of protein biotherapeutics and their aggregates. J Liq Chromatogr Relat Technol. 2012;35(20):2923–2950. doi: 10.1080/10826076.2012.743724.
  • 21. Creutz C, Chou M, Netzel TL, Okumura M, Sutin N. Lifetimes, spectra, and quenching of the excited states of polypyridine complexes of iron(II), ruthenium(II), and osmium(II). J Am Chem Soc. 1980;102(4):1309–1319.
  • 22. McCusker JK. Femtosecond absorption spectroscopy of transition metal charge-transfer complexes. Acc Chem Res. 2003;36(12):876–887. doi: 10.1021/ar030111d.
  • 23. Martell AE, Smith RM. Critical Stability Constants. Springer; Boston: 1982.
  • 24. Horton N, Lewis M. Calculation of the free energy of association for protein complexes. Protein Sci. 1992;1(1):169–181. doi: 10.1002/pro.5560010117.
  • 25. Baxendale JH, George P. The kinetics of formation and dissociation of the ferrous tris-dipyridyl ion. Trans Faraday Soc. 1950;46(0):736–739.
  • 26. Lee HS, Spraggon G, Schultz PG, Wang F. Genetic incorporation of a metal-ion chelating amino acid into proteins as a biophysical probe. J Am Chem Soc. 2009;131(7):2481–2483. doi: 10.1021/ja808340b.
  • 27. Hoover DM, Lubkowski J. DNAWorks: An automated method for designing oligonucleotides for PCR-based gene synthesis. Nucleic Acids Res. 2002;30(10):e43. doi: 10.1093/nar/30.10.e43.
  • 28. Jancarik J, Kim SH. Sparse matrix sampling: A screening method for crystallization of proteins. J Appl Crystallogr. 1991;24(4):409–411.
  • 29. Otwinowski Z, Minor W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 1997;276:307–326. doi: 10.1016/S0076-6879(97)76066-X.
  • 30. McCoy AJ, Grosse-Kunstleve RW. Phaser crystallographic software. J Appl Crystallogr. 2007;40:658–674. doi: 10.1107/S0021889807021206.
  • 31. Terwilliger TC, et al. Iterative model building, structure refinement and density modification with the PHENIX AutoBuild wizard. Acta Crystallogr D Biol Crystallogr. 2008;64(Pt 1):61–69. doi: 10.1107/S090744490705024X.
  • 32. DiMaio F, et al. Improved low-resolution crystallographic refinement with Phenix and Rosetta. Nat Methods. 2013;10(11):1102–1104. doi: 10.1038/nmeth.2648.
  • 33. Adams PD, et al. PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr. 2010;66(Pt 2):213–221. doi: 10.1107/S0907444909052925.
