Abstract
While cryo-electron microscopy (cryo-EM) has revolutionized the structure determination of supramolecular protein complexes that are refractory to structure determination by X-ray crystallography, structure determination by cryo-EM can nonetheless be complicated by excessive conformational flexibility or structural heterogeneity resulting from weak or transient protein-protein association. Since such transient complexes are often critical for function, specialized approaches must be employed for the determination of meaningful structure-function relationships. Here, we outline examples in which transient protein-protein interactions have been visualized successfully by cryo-EM in the biosynthesis of fatty acids, polyketides, and terpenes. These studies demonstrate the utility of chemical crosslinking to stabilize transient protein-protein complexes for cryo-EM structural analysis, as well as the use of partial signal subtraction and localized reconstruction to extract useful structural information out of cryo-EM data collected from inherently dynamic systems. While these approaches do not always yield atomic resolution insights on protein-protein interactions, they nonetheless enable direct experimental observation of complexes in assembly-line biosynthesis that would otherwise be too fleeting for structural analysis.
1. Introduction
The “resolution revolution” in cryo-electron microscopy (cryo-EM) continues to yield improvements in resolution and particle size limitation due to significant advances in electron detectors, image processing algorithms, and software.1,2 These advances have facilitated the structure determination of large multiprotein complexes that are often refractory to structure determination by X-ray crystallography. Even so, cryo-EM structure determination can still be hindered by excessive conformational flexibility or structural heterogeneity resulting from transiently interacting protein-protein complexes. For example, natural products biosynthesis is often achieved by modular assembly-line complexes, components of which are often weakly associated in short-lived complexes that direct a precise sequence of enzyme-catalyzed reactions. Recent advances in cryo-EM sample preparation and image analysis (Figure 1) enable the visualization of such fleeting structures, even if not all insights are at atomic resolution.
Figure 1:

(A) In some proteins, a domain of interest adopts variable positions relative to the rest of the protein; the variably-positioned domain is averaged out in high-resolution reconstructions. (B) Chemical crosslinking can trap a protein domain in an associated position to yield a more uniform particle for structural characterization. (C) Partial signal subtraction removes density corresponding to the well-ordered portion of the protein, while masked classification bins particles according to the position of the domain of interest. Following masked classification, only particles with the domain of interest in a particular position are included in structural analysis of the complete protein. Similarly, localized reconstruction involves extraction of sub-particles exclusively containing the domain of interest. Sub-particles are refined separately and can be incorporated later to generate a reconstruction of the complete protein.
With regard to sample preparation, covalent crosslinking can be utilized to trap transiently associated protein-protein complexes for structure determination by cryo-EM. For example, glutaraldehyde and its polymeric dialdehydes can form imine linkages with lysine residues on adjacent protein surfaces, thus creating a covalent tether between neighboring proteins.3 Alternatively, specialized crosslinking reagents can be designed to target binding to two specific locations, e.g., nearby enzyme active sites. Either crosslinking strategy can be exploited to stabilize an otherwise transient protein-protein complex.
The technique of partial signal subtraction4 involves the use of masked three-dimensional (3D) classification in the analysis of a specific protein domain of interest observed to vary position in a multiprotein complex. Overlap of weak density for the specific protein domain of interest and strong density for the rest of the multiprotein complex can obfuscate classification and alignment. Partial signal subtraction mitigates this overlap by subtracting signal from each particle to highlight the signal for the specific protein domain of interest. Alignment-free masked classification can then be employed to sort particles for further analysis.
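The subtraction step described above can be illustrated with a toy sketch. It is not the implementation of ref. 4: it assumes a single fixed viewing direction along z, no contrast transfer function, and no noise. The function names (`project_z`, `apply_mask`, `subtract_signal`) are hypothetical and chosen only for illustration. The well-ordered part of a reference volume is projected and subtracted from a particle image, leaving only the signal that falls inside the mask around the domain of interest.

```python
def project_z(volume):
    """Sum a volume indexed [z][y][x] along z to give a 2D image [y][x]."""
    ny, nx = len(volume[0]), len(volume[0][0])
    return [[sum(plane[y][x] for plane in volume) for x in range(nx)]
            for y in range(ny)]


def apply_mask(volume, mask, keep_inside):
    """Zero voxels inside the mask (keep_inside=False) or outside it (True)."""
    return [[[v if bool(m) == keep_inside else 0.0
              for v, m in zip(vrow, mrow)]
             for vrow, mrow in zip(vplane, mplane)]
            for vplane, mplane in zip(volume, mask)]


def subtract_signal(particle_image, reference_volume, domain_mask):
    """Remove the projected density of everything outside the domain mask."""
    # Reference density with the domain of interest masked out.
    ordered_part = apply_mask(reference_volume, domain_mask, keep_inside=False)
    # Toy assumption: the particle is viewed straight down the z axis.
    ordered_projection = project_z(ordered_part)
    return [[p - o for p, o in zip(prow, orow)]
            for prow, orow in zip(particle_image, ordered_projection)]
```

In this simplified picture, the subtracted images contain only the weak density of the domain of interest, which is what the subsequent alignment-free masked classification operates on.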
Localized reconstruction5 similarly allows for the analysis of conformationally variable or sub-stoichiometric complexes. Low-occupancy conformers or protein domains are identified and extracted from two-dimensional (2D) images as sub-particles, separated from bulk density. Classification and refinement of separate sub-particles enables alignment focusing on the domain or region of interest, which can reveal uncharacterized positions and orientations. In the remainder of this Review, we show how these techniques have been used to advance our understanding of transiently associated complexes in assembly-line biosynthesis as exemplified by fatty acid synthases, polyketide synthases, and terpene synthases.
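The geometric idea behind sub-particle extraction can be sketched as follows. This is a minimal illustration rather than the procedure of ref. 5: it assumes that each particle already has a refined orientation expressed as a 3×3 rotation matrix, and that the position of the domain of interest is known as a vector from the particle center in the frame of the 3D model. The function names are hypothetical.

```python
import math


def rotation_about_z(angle_deg):
    """Rotation matrix for a counterclockwise rotation about the z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotate(matrix, vector):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def subparticle_center(particle_xy, rotation, offset_in_model):
    """Locate a sub-particle in the 2D image of its parent particle.

    The model-frame offset is rotated into the particle orientation; its
    x and y components shift the extraction center within the image, and
    its z component is the displacement along the beam direction.
    """
    dx, dy, dz = rotate(rotation, offset_in_model)
    return (particle_xy[0] + dx, particle_xy[1] + dy, dz)
```

Sub-particle boxes centered on the returned x and y coordinates could then be extracted from each micrograph and classified or refined separately from the bulk density. The returned z displacement could in principle be used to adjust the defocus assigned to each sub-particle.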
2. Fatty acid synthases
In higher eukaryotes, fungi, and certain bacteria, fatty acids are generated by type-I fatty acid synthases.6,7 The type-I fatty acid synthase from yeast forms a 2.6 MDa hexamer with subunit stoichiometry α6β6 in which each subunit contains four functional domains (Figure 2A). In comparison, mycobacterial fatty acid synthases are 2.0 MDa hexamers with subunit stoichiometry α6. In each system, multidomain subunits assemble to form a supramolecular barrel that encapsulates six reaction chambers. Each “dome” of the barrel contains three reaction chambers. The acyl carrier protein (ACP) domain shuttles reaction intermediates through a well-defined sequence from one active site to another in each chamber.
Figure 2:

(A) Type-I fatty acid synthases from yeast (left) and mycobacteria (right) adopt supramolecular barrel structures containing reaction chambers in which the acyl carrier protein domain shuttles reaction intermediates from one active site to another. Abbreviations: ACP, acyl carrier protein; AT, acetyl transferase; DH, dehydratase; ER, enoyl reductase; KR, ketoreductase; KS, ketosynthase; MAT, malonyl-acetyl transferase; MPT, malonyl-palmitoyl transferase; PPT, phosphopantetheine transferase. Each barrel consists of top and bottom “domes”, each of which contains three reaction chambers. A bound ACP domain was modeled adjacent to the KR domain in the 4.0 Å-resolution crystal structure of the yeast fatty acid synthase at left but not in the 7.5 Å-resolution cryo-EM structure of the mycobacterial fatty acid synthase at right. Reprinted with permission from ref. 6. Copyright 2014 Elsevier. (B) Top-down cutaway view of the yeast fatty acid synthase barrel showing multiple positions for ACP binding near KS (cyan), AT (green), KR (blue), and ER (yellow) catalytic domains. Weaker, unmodeled density was also observed near the MPT (red) and DH (orange) domains. The three reaction chambers in the dome are separated by structural protein domains (green). Reprinted with permission from ref. 9. Copyright 2010 National Academy of Sciences. (C) Map of M. tuberculosis fatty acid synthase low-pass filtered to 15 Å resolution showing clear density for the ACP domain. The top view is comparable to that shown in (A) (right) rotated ~30° toward the viewer. (D) Four additional 3D classes showing distinct density for the ACP domain in the upper dome (orientation is comparable to that shown in (A) (right)). While each ACP orientation is distinct, all are in the vicinity of the KS domain. Reprinted with permission from ref. 10. Copyright 2018 The Authors.
The mobility of the ACP domain and its transient, stochastic association with multiple active sites challenges the structural analysis of type-I fatty acid synthases. Multiple positions for the ACP domain are typically observed in cryo-EM reconstructions, but they are often left unmodeled due to poor or missing density,8 or they are modeled in variable positions.9,10 For example, the 5.9 Å-resolution cryo-EM reconstruction of type-I yeast fatty acid synthase reveals density for the ACP bound in multiple locations with varying occupancies in the reaction chamber. Association of the ACP domain with different catalytic domains provides direct evidence for a substrate shuttling mechanism (Figure 2B).9
Recently, the 3.3 Å-resolution cryo-EM structure of the type-I fatty acid synthase from Mycobacterium tuberculosis was reported.10 The ACP domain was variably positioned and could not be modeled due to insufficient density in initial cryo-EM reconstructions. To probe possible states of the ACP domain, separate 3D classes corresponding to different putative ACP locations were refined and low-pass filtered to 15 Å resolution. This yielded clear density for ACP domains in variable positions near the ketosynthase (KS) module within the reaction chamber (Figure 2C). The localized reconstruction method5 was used to extract specific ACP sub-particles, which were classified into 3, 5, and 10 classes and afforded density for variably positioned ACP domains (Figure 3). This approach yielded several new ACP positions and orientations even though improved map resolution was not achieved.
Figure 3:

Localized reconstruction of the ACP domain in M. tuberculosis fatty acid synthase reveals multiple orientations in the reaction chamber near the KS domain. Reprinted with permission from ref. 10. Copyright 2018 The Authors.
These structural studies reveal new functional insights despite ambiguities in the visualization of ACP domains. For example, mixed charge interfaces that can preferentially interact with negatively charged ACP differentiate mycobacterial fatty acid synthases from fungal or yeast fatty acid synthases. Thus, these approaches yield valuable functional insight even though structure determinations are not achieved at atomic resolution.
3. Polyketide synthases
Polyketide or non-ribosomal peptide biosynthesis is achieved by megasynthases similar to fatty acid synthases, in which multiple modules catalyze consecutive reactions. Each module contains at least two separate catalytic domains operating sequentially on an intermediate tethered to an ACP or peptidyl carrier protein (PCP) that transiently associates with biosynthetic modules; communication between the ACP/PCP and the catalytic domains is crucial for efficient biosynthesis.11,12 Intermodular communication between catalytic domains is also important, achieved by specific protein-protein interactions.13 Broad product chemodiversity in such systems is encoded in precisely directed sequences of protein-protein and domain-domain interactions.14,15 Characterization of type-I polyketide synthases remains a formidable challenge due to their massive size and inherent structural variability.
Polyketide synthase modules uniformly contain a condensing region and some contain a modifying region; ACPs carry tethered intermediates from one catalytic domain to the next.16 Condensing regions minimally consist of acyltransferase (AT) domains and ketosynthase (KS) domains that together catalyze the loading of extender units onto a growing polyketide chain. In iterative nonreducing polyketide synthases, a starter unit acyltransferase (SAT) within the condensing region is responsible for initial transfer of acyl units to the ACP, while a malonyl-CoA acyltransferase (MAT) charges the ACP with malonyl extender units following initial substrate loading of the KS domain. Nascent products from the condensing region are then carried to modifying regions containing a combination of KR, DH, and ER domains to catalyze consecutive reactions leading to the desired product.16
Dimeric domain architecture and interactions are conserved in nonreducing polyketide synthases, where the condensing and modifying regions are structurally segregated; the isolated SAT-KS-MAT condensing region is catalytically competent in the presence of ACP.17 Protein-protein interactions have a more substantial effect on biosynthetic fidelity than substrate-protein interactions.13 However, the transient nature of these interactions challenges their structural characterization.
Recently, the 7.1 Å-resolution cryo-EM structure of a full condensing region of the nonreducing polyketide synthase CTB1 trapped with substrate-loaded ACP was reported, thereby allowing for the characterization of ACP-KS interactions in a post-loading state (Figure 4).18 CTB1 catalyzes the iterative synthesis of nor-toralactone in the biosynthesis of the phytotoxin cercosporin. The cryo-EM structure of CTB1 consists of a (SAT-KS-MAT)2~ACP complex (~ denotes crosslinking). In addition to mechanism-based covalent crosslinking, the normally transient binding of substrate-loaded ACP to KS was stabilized by inactivation of the SAT and MAT domains. A single ACP is observed bound to the SAT-KS-MAT dimer. In this example, masked 3D classification and refinement following partial signal subtraction improved the resolution of density maps, but no significantly different features were observed. Nevertheless, this study demonstrates the utility of covalent crosslinking for stabilizing transiently associated protein-protein interactions for structure determination by cryo-EM.
Figure 4:

(A) Primary structure of the polyketide synthase CTB1 showing domain organization; the SAT-KS-MAT and ACP domains (full color) were used for cryo-EM structure determination (other domains are represented in faded colors). Multiple condensation cycles followed by cyclization and release catalyzed by the thioesterase (TE) domain yield nor-toralactone, further derivatization of which yields cercosporin. (B) The 7.1 Å-resolution density map of the (SAT-KS-MAT)2~ACP complex reveals density for a single covalently crosslinked ACP domain associated with the KS domain in “front” view; ACP density is absent from the KS domain in the “back” view. Reprinted with permission from ref. 18. Copyright 2018 Nature Publishing Group.
4. Terpene synthases
Terpenoid natural products owe much of their chemodiversity to terpene cyclases. The condensation of 5-carbon isoprenoids in chain elongation reactions catalyzed by prenyltransferases yields increasingly longer isoprenoid diphosphates, which in turn are converted into polycyclic terpenes in multi-step reactions catalyzed by terpene cyclases.19 Such cyclization reactions comprise the first committed steps of terpenoid biosynthesis and exemplify the most complex chemistry found in nature: in a single enzyme-catalyzed reaction, a cascade of multiple ring closures typically requires changes in bonding and hybridization for more than half of the carbon atoms of the substrate.
Some fungal terpenoid synthases harbor both prenyltransferase and cyclase activities within a single polypeptide chain, such that isoprenoid chain elongation and cyclization reactions occur in assembly-line fashion.20 These bifunctional enzymes are found as oligomers, namely hexamers and octamers, in which catalytic domains are connected by a flexible linker segment approximately 70–120 residues in length. While X-ray crystallography generally yields insight regarding terpene synthase structure-mechanism relationships,19 bifunctional assembly-line synthases are refractory to crystallization owing to disorder and variable domain positioning in heterogeneous oligomeric assemblies.21,22
Recently, cryo-EM has been utilized to characterize a bifunctional assembly-line terpene synthase, fusicoccadiene synthase from Phomopsis amygdali (PaFS) (Figure 5A).22 Initial single particle analysis of full-length PaFS indicated predominantly octameric particles. Surprisingly, however, the 3.99 Å-resolution reconstruction revealed only the octameric prenyltransferase (Figure 5B). The cyclase domain was invisible in reconstructions, yet it was confirmed to be present based on sample analysis by SDS-PAGE and steady-state kinetic measurements of cyclase activity. Analysis of raw negative-stain electron micrographs of native full-length PaFS revealed splayed-out satellites surrounding central doughnut-like particles, i.e., randomly positioned cyclase domains surrounding a prenyltransferase octamer (Figure 5C).
Figure 5:

(A) The prenyltransferase domain of fusicoccadiene synthase from P. amygdali (PaFS) catalyzes the condensation of one molecule of dimethylallyl diphosphate (DMAPP) and three molecules of isopentenyl diphosphate (IPP) to yield geranylgeranyl diphosphate (GGPP), which then undergoes cyclization in the cyclase domain to yield the tricyclic product. These comprise the first steps in the biosynthesis of Fusicoccin A. (B) The 3.99 Å-resolution cryo-EM reconstruction of native full-length PaFS reveals only the prenyltransferase domain assembled as a tetramer of A-B dimers (blue and cyan, respectively); density for the cyclase domains, which would be attached by flexible linkers to the N-termini of prenyltransferase domains, is absent. (C) Negative-stain EM image of native full-length PaFS reveals prenyltransferase octamers surrounded by cyclase domains splayed-out in random positions (highlighted in red circles). (D) Implementation of partial signal subtraction and masked classification with cryo-EM data collected from glutaraldehyde-crosslinked PaFS enabled the observation of densities corresponding to cyclase domains in three distinct positions at the side of the prenyltransferase octamer as well as a position that caps the central pore of the octamer. Reprinted with permission from ref. 22. Copyright 2021 The Authors.
Glutaraldehyde crosslinking and gradient ultracentrifugation were employed in sample preparation to ascertain whether the prenyltransferase and cyclase domains could be transiently associated. Partial signal subtraction, symmetry expansion, and masked classification were employed with cryo-EM data collected from crosslinked PaFS to yield a subset of particles in which cohesive density corresponding to a cyclase domain was observed capping the central pore of the prenyltransferase octamer (Figure 5D). This particle subset yielded an 11.9 Å-resolution reconstruction in which the capping density was readily fit with the previously determined21 crystal structure of the PaFS cyclase domain.
Cyclase domains were thought to adopt additional positions in this dataset, but a low number of particles required symmetry expansion about a central C4 axis to enhance the possible signals for associated cyclase domains. Three distinct positions for a cyclase domain on the side of the prenyltransferase octamer were identified upon partial signal subtraction and masked 3D classification, yielding particle subsets that were separately refined in reconstructions at 8.4–9.4 Å resolution (Figure 5D).
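The bookkeeping behind symmetry expansion can be sketched in a few lines. This is a toy illustration and not the workflow of ref. 22: it assumes that the symmetry axis coincides with the z axis of the reference, and that each particle record carries a `rot` angle (in degrees) describing rotation about that axis. The record layout and the function name `symmetry_expand` are hypothetical.

```python
def symmetry_expand(particles, fold):
    """Return one copy of each particle per symmetry-related orientation.

    For an n-fold axis, copy k is rotated by k * 360 / n degrees about the
    symmetry axis, so each asymmetric unit is brought to the same position
    in the reference frame exactly once.
    """
    step = 360.0 / fold
    expanded = []
    for particle in particles:
        for k in range(fold):
            copy = dict(particle)  # keep all other metadata unchanged
            copy["rot"] = (particle["rot"] + k * step) % 360.0
            copy["sym_copy"] = k   # which symmetry copy this record represents
            expanded.append(copy)
    return expanded
```

With `fold=4`, every particle contributes four records, so a mask placed over a single candidate cyclase position samples all four symmetry-equivalent sites of each particle during masked classification.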
The combination of negative-stain EM and cryo-EM data, and the corroboration of protein-protein contacts using mass spectrometry, suggest a model for PaFS in which cyclase domains are in constant motion about a central prenyltransferase octamer yet are capable of adopting transiently associated positions that may facilitate substrate channeling between active sites (Figure 6).22 Substrate transfer could occur when the cyclase domain is closely associated with the prenyltransferase core, e.g., when the cyclase domain adopts a position that caps the central cavity into which the product of the prenyltransferase domain, geranylgeranyl diphosphate, is released. Alternatively, geranylgeranyl diphosphate released from a prenyltransferase domain may encounter one of eight randomly positioned cyclase domains in a constantly moving cluster splayed out from the prenyltransferase core. As such, fusicoccadiene synthase may provide an example of dynamic cluster channeling.23,24 Further experiments will clarify the mode of substrate transfer between active sites in this system.
Figure 6:

Possible conformational states of full-length PaFS in which cyclase domains (green) can be associated with the central prenyltransferase octamer (blue) or randomly splayed out as connected by a flexible linker (red). GGPP is released into the central pore of the prenyltransferase octamer and transits to any one of the pendant cyclase domains to generate fusicoccadiene. Reprinted with permission from ref. 22. Copyright 2021 The Authors.
5. Conclusions
The studies summarized herein exemplify the utility of covalent crosslinking in stabilizing transiently associated protein-protein complexes, as well as the power of partial signal subtraction and localized reconstruction in extracting useful structural information out of inherently dynamic systems. While these approaches do not always yield atomic resolution insights on protein-protein interactions, they do enable direct experimental observation of protein-protein complexes that would otherwise be too fleeting for visualization by cryo-EM. These approaches have facilitated critical first steps toward understanding the structural basis of substrate flux in assembly-line biosynthesis. Moreover, these studies demonstrate how high-resolution X-ray crystallography and intermediate- to low-resolution cryo-EM complement each other to advance our understanding of catalysis in these complex systems.
Acknowledgements
This work was supported by NIH grant GM56838 to D.W.C. J.L.F. thanks the Structural Biology and Molecular Biophysics NIH Training Grant T32-GM008275 for support.
Footnotes
Declaration of Competing Interest
The authors declare that they have no known competing financial or personal relationships that could have appeared to influence the work reported in this paper.
References
- 1. Henderson R, 1995. The potential and limitations of neutrons, electrons and X-rays for atomic resolution microscopy of unstained biological molecules. Quart. Rev. Biophys 28, 171–193.
- 2. Wu M, Lander GC, 2020. How low can we go? Structure determination of small biological complexes using single-particle cryo-EM. Curr. Opin. Struct. Biol 64, 9–16.
- 3. Migneault I, Dartiguenave C, Bertrand MJ, Waldron KC, 2004. Glutaraldehyde: behavior in aqueous solution, reaction with proteins, and application to enzyme crosslinking. BioTechniques 37, 790–802.
- 4. Bai XC, Rajendra E, Yang G, Shi Y, Scheres SH, 2015. Sampling the conformational space of the catalytic subunit of human γ-secretase. eLife 4, e11182.
- 5. Ilca SL, Kotecha A, Sun X, Poranen MM, Stuart DI, Huiskonen JT, 2015. Localized reconstruction of subunits from electron cryomicroscopy images of macromolecular complexes. Nat. Commun 6, 8843.
- 6. Grininger M, 2014. Perspectives on the evolution, assembly, and conformational dynamics of fatty acid synthase type I (FAS I) systems. Curr. Opin. Struct. Biol 25, 49–56.
- 7. Herbst DA, Townsend CA, Maier T, 2018. The architectures of iterative type I PKS and FAS. Nat. Prod. Rep 35, 1046–1069.
- 8. Ciccarelli L, Connell SR, Enderle M, Mills DJ, Vonck J, Grininger M, 2013. Structure and conformational variability of the Mycobacterium tuberculosis fatty acid synthase multienzyme complex. Structure 21, 1251–125.
- 9. Gipson P, Mills DJ, Wouts R, Grininger M, Vonck J, Kühlbrandt W, 2010. Direct structural insight into the substrate-shuttling mechanism of yeast fatty acid synthase by electron cryomicroscopy. Proc. Natl. Acad. Sci. U.S.A 107, 9164–9169.
- 10. Elad N, Baron S, Peleg Y, Albeck S, Grunwald J, Raviv G, Shakked Z, Zimhony O, Diskin R, 2018. Structure of type-I Mycobacterium tuberculosis fatty acid synthase at 3.3 Å resolution. Nat. Commun 9, 3886.
- 11. Robbins T, Liu YC, Cane DE, Khosla C, 2016. Structure and mechanism of assembly line polyketide synthases. Curr. Opin. Struct. Biol 41, 10–18.
- 12. Reimer JM, Haque AS, Tarry MJ, Schmeing TM, 2018. Piecing together nonribosomal peptide synthesis. Curr. Opin. Struct. Biol 49, 104–113.
- 13. Klaus M, Ostrowski MP, Austerjost J, Robbins T, Lowry B, Cane DE, Khosla C, 2016. Protein-protein interactions, not substrate recognition, dominate the turnover of chimeric assembly line polyketide synthases. J. Biol. Chem 291, 16404–16415.
- 14. Chen A, Re RN, Burkart MD, 2018. Type II fatty acid and polyketide synthases: deciphering protein–protein and protein–substrate interactions. Nat. Prod. Rep 35, 1029–1045.
- 15. Keatinge-Clay AT, 2012. The structures of type I polyketide synthases. Nat. Prod. Rep 29, 1050–1073.
- 16. Weissman KJ, 2015. The structural biology of biosynthetic megaenzymes. Nat. Chem. Biol 11, 660–670.
- 17. Crawford JM, Thomas PM, Scheerer JR, Vagstad AL, Kelleher NL, Townsend CA, 2008. Deconstruction of iterative multidomain polyketide synthase function. Science 320, 243–246.
- 18. Herbst DA, Huit-Roehl CR, Jakob RP, Kravetz JM, Storm PA, Alley JR, Townsend CA, Maier T, 2018. The structural organization of substrate loading in iterative polyketide synthases. Nat. Chem. Biol 14, 474–479.
- 19. Christianson DW, 2017. Structural and chemical biology of terpenoid cyclases. Chem. Rev 117, 11570–11648.
- 20. Faylo JL, Ronnebaum TA, Christianson DW, 2021. Acc. Chem. Res 54, in press. 10.1021/acs.accounts.1c00296
- 21. Chen M, Chou WKW, Toyomasu T, Cane DE, Christianson DW, 2016. ACS Chem. Biol 11, 889–899.
- 22. Faylo JL, van Eeuwen T, Kim HJ, Gorbea Colón JJ, Garcia BA, Murakami K, Christianson DW, 2021. Structural insights on assembly-line terpene biosynthesis. Nat. Commun 12, 3487.
- 23. Castellana M, Wilson MZ, Xu Y, Joshi P, Cristea IM, Rabinowitz JD, Gitai Z, Wingreen NS, 2014. Enzyme clustering accelerates processing of intermediates through metabolic channeling. Nat. Biotechnol 32, 1011–1018.
- 24. Sweetlove LJ, Fernie AR, 2018. The role of dynamic enzyme assemblies and substrate channelling in metabolic regulation. Nat. Commun 9, 2136.
