Abstract
Over the last 15 years, structural biology has seen unprecedented development and improvement in two areas: electron cryo-microscopy (cryo-EM) and predictive modeling. Once relegated to low resolutions, single-particle cryo-EM is now capable of achieving near-atomic resolutions of a wide variety of macromolecular complexes. Ushered in by AlphaFold, machine learning has powered the current generation of predictive modeling tools, which can accurately and reliably predict models for proteins and some complexes directly from the sequence alone. Although they offer new opportunities individually, there is an inherent synergy between these techniques, allowing for the construction of large, complex macromolecular models. Here, we give a brief overview of these approaches in addition to illustrating works that combine these techniques for model building. These examples provide insight into model building, assessment, and limitations when integrating predictive modeling with cryo-EM density maps. Together, these approaches offer the potential to greatly accelerate the generation of macromolecular structural insights, particularly when coupled with experimental data.
Significance
Advances in structure determination by electron cryo-microscopy and predictive modeling have revolutionized modern structural biology. This review examines the synergy between these two techniques, as well as their limitations, in the structural and functional characterization of large macromolecular assemblies.
Introduction
Determining the structure of macromolecules, i.e., how the atoms are arranged in space and how they change over time or in response to external factors, is essential to understanding their function and the development of new therapeutics (1,2). To this end, structural biology has produced incredible advances in our understanding and control of fundamental biological processes and treatment of human diseases.
Thanks in large part to the development of new hardware and software that has enabled structure determination at near-atomic resolutions, single-particle electron cryo-microscopy (cryo-EM) has become increasingly important in understanding the complex structure and function of macromolecular assemblies (3,4,5). Cryo-EM, with resolutions approaching 1 Å (6,7,8), has resolved the structure-function details of many large assemblies, including severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (9,10,11), ribosomes (12,13,14), ion channels (15,16,17), and G-protein-coupled receptors (18,19,20), and has become critical for improving human health and medicine (21,22).
By 2008, the first four single-particle cryo-EM structures at resolutions better than 5 Å were published: rotavirus at 3.8-Å resolution (23), cytoplasmic polyhedrosis virus (CPV) at 3.88-Å resolution (24), bacteriophage ε15 at 4.5-Å resolution (25), and GroEL at 4-Å resolution (26). These density maps revealed unprecedented insights into their complex structure and clearly resolved the protein fold, secondary structure, and large, bulky sidechains of individual subunits within these macromolecular assemblies. Collectively, these structures ushered in the “resolution revolution” cryo-EM would experience over the next decade (3).
Before achieving near-atomic resolutions, modeling of cryo-EM density maps was typically restricted to fitting known or related structures to the map or using feature-recognition tools to provide a low-resolution description of the protein components in a complex (27,28,29). For two of these four “new” near-atomic-resolution cryo-EM density maps, CPV and ε15, no known or related structures had been determined. As such, new modeling tools were developed to derive atomic models directly from the density map without the aid of a homologous or template structure, ultimately giving rise to new de novo modeling approaches for cryo-EM density modeling (28,30). For ε15, CPV, and GroEL, the foundation for de novo modeling was to establish a sequence to structure alignment; this alignment could “anchor” features seen in the density with sequence elements (24,25,26). Here, the sequence to structure alignment was formed by matching secondary-structure elements observed in the map with those predicted in the sequence of the individual proteins (31,32). Using these anchors, atomic models for the secondary structure could be built and loops connecting these elements could then be traced in the density map assuming idealized Cα-Cα distance. Large, bulky side-chain densities could serve as local registration points to ensure de novo modeling accuracy.
As more near-atomic-resolution cryo-EM density maps were produced, these ad hoc methods were formalized into more robust and accessible software packages. This new software also gave rise to new model refinement and validation tools specific to cryo-EM-based models (30,33,34). By 2010, the first Cryo-EM Modeling Challenge was issued, providing the community with a set of target maps to test various modeling tools (35). Although there were no “winners,” the challenge proved an invaluable tool for researchers to better understand the limits and pitfalls of de novo modeling. Since the first challenge, a number of cryo-EM modeling challenges have been sponsored to address specific de novo modeling topics, including standards, validation, automation, and ligand identification (https://challenges.emdataresource.org/) (36,37).
Similar to cryo-EM, predictive modeling is undergoing a revolution (38,39). New modeling tools based on artificial intelligence and machine-learning methods, such as AlphaFold (40) and RoseTTAFold (41), can reliably predict the 3D structure of a wide range of single proteins and small complexes directly from the primary sequence. During CASP14 (42), models generated by AlphaFold for the target proteins demonstrated tremendous accuracy and were on par with models generated from experimental data (43,44,45). With the improved accuracy, predictive models could now be used for molecular replacement in X-ray crystallography (46,47,48,49), combined with mass spectrometry (50) and NMR (51), or fitted into near-atomic-resolution cryo-EM density maps (52,53). Currently, AlphaFold2 and its derivatives stand as the high-water mark in predictive modeling (54).
At its core, AlphaFold, developed by DeepMind, utilizes a machine-learning-based model trained on experimentally derived structures from the PDB to predict a set of potential models when given a protein sequence (40,55,56). Available as freely downloadable software or through online portals (57), model generation with AlphaFold2 is relatively simple, often requiring only the sequence of interest (Fig. 1, top row). Homologous templates can be entered alongside the sequence to help guide the predictions; a variety of settings and tolerances can also be specified to change the speed and number of predictions. In addition to the software, the European Bioinformatics Institute’s AlphaFold Database boasts over 200 million publicly available models predicted using AlphaFold2 (58). Moreover, multiple tools have been built to access and facilitate the usage of the AlphaFold Database, such as FoldSeek (59), a utility that searches the database for homologous proteins based on tertiary structure instead of sequence.
Figure 1.
Modeling cryo-EM density with AlphaFold. A simplified pathway for modeling cryo-EM density maps with predicted models from AlphaFold2 is shown. In the orange-bordered boxes, a model for β-galactosidase was constructed with almost no human intervention: models were first generated from the sequence (row 2), the “best” model (row 3) was selected and fit to the cryo-EM density map (row 4) and refined with Phenix (row 5). The blue-bordered boxes illustrate a similar pathway for modeling the SARS-CoV-2 Nsp2 density map with AlphaFold: models were generated from the sequence of Nsp2 (row 2), the best model (row 3) was fit to the density map (row 4) and then refined with Coot and Phenix (row 5). Five predicted models for each example, colored based on pLDDT, are shown in (A); the PAE matrix (B), per-residue pLDDT plot (C), and model refinement step (D) are shown on the sides of the diagram.
Despite the relative simplicity of generating predicted models with AlphaFold, the resulting models and output require careful attention when assessing accuracy and usability. Alongside the models (Fig. 1 A), AlphaFold2 outputs a predicted aligned error (PAE) matrix (Fig. 1 B), a predicted local distance difference test (pLDDT) score (Fig. 1 C), and a predicted template modeling (pTM) score (40). pLDDT (ranging 0–100) is a per-residue score based on the local distance difference test developed by Mariani et al. to evaluate stereochemical plausibility (60); an example of a per-residue pLDDT plot is shown in Fig. 1 C, whereas Fig. 1 D shows the pLDDT score mapped to the predicted model. Regions with pLDDT scores >90 are considered to have high accuracy and to be sufficient to evaluate atomic details and interactions. pLDDT scores of 70–90 are considered to be good backbone predictions, whereas 50–70 are considered low-confidence predictions. Regions with scores <50 may be poorly predicted, disordered, or highly flexible, and thus not as useful in interpreting the structure of interest. pTM (ranging 0–1) is a global score based on Zhang and Skolnick’s template modeling (TM) score, which quantifies the similarity between two protein structures (61,62). Although this is similar to the root-mean-square deviation (RMSD), the TM score is weighted to prioritize global congruity over local variance. Two random proteins have an average score of 0.17, whereas scores >0.5 can generally be considered to indicate the same fold. Of course, there are exceptions when it comes to comparing homologous or metamorphic proteins. Two similar but distinct conformations can have a TM score >0.5 due to overall similarity but still be of different folds. In these cases, visual analysis is required to confirm the score. The PAE matrix quantifies the expected error between any two residues on the model and provides a between-domains confidence value (Fig. 1 B). 
When the PAE between residues from two domains is low, the relative positions of those domains are predicted with confidence; when it is high, their relative placement is likely unreliable. Ultimately, this collection of scores helps to guide users in interpreting both local and global accuracy of the predicted models.
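The scores described above can be summarized in a few lines of code. The following is a minimal sketch rather than part of any AlphaFold distribution: the function names are hypothetical, the pLDDT bands are the thresholds quoted in the text, the TM score uses the standard length-dependent scale d0 = 1.24(L − 15)^(1/3) − 1.8 of Zhang and Skolnick, and the PAE matrix is assumed to be available as a square list of lists in Å.

```python
def plddt_band(score):
    """Map a per-residue pLDDT (0-100) onto the confidence bands quoted in the text."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT is defined on the range 0-100")
    if score > 90:
        return "high accuracy"
    if score >= 70:
        return "good backbone"
    if score >= 50:
        return "low confidence"
    return "poorly predicted or disordered"


def tm_score(distances, target_length):
    """TM score from distances (in Å) between aligned residue pairs after superposition.

    Each pair contributes 1 / (1 + (d / d0)^2), so a few badly placed residues
    cost little, unlike in an RMSD where large deviations dominate.
    """
    if target_length <= 21:
        raise ValueError("this sketch only covers targets longer than 21 residues")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


def mean_interdomain_pae(pae, domain_a, domain_b):
    """Average PAE over all residue pairs that span two domains (both directions)."""
    values = [pae[i][j] for i in domain_a for j in domain_b]
    values += [pae[j][i] for i in domain_a for j in domain_b]
    return sum(values) / len(values)
```

For example, a 100-residue alignment in which 99 residues deviate by 1 Å and a single residue deviates by 50 Å has an RMSD above 5 Å, yet `tm_score([1.0] * 99 + [50.0], 100)` remains above 0.9, which illustrates why the TM score is less sensitive to local outliers.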
With the success of AlphaFold, a number of machine-learning tools have arisen for modeling more complex structures. For large macromolecular assemblies, methods such as AlphaFold-Multimer (63,64) and FoldDock (65) can generate models of heterogeneous protein assemblies. To provide additional biochemical context to model generation, AlphaFill incorporates ligands and metal ions for rapid analysis of common ligand surface interactions with the target model (66). For custom modifications, OpenFold is an open-source solution that allows users to retrain the model on their own training dataset as needed (67). AlphaFold2 is already being utilized for drug discovery: Ren et al. used AlphaFold within an artificial intelligence pipeline to generate models of targets that could be used to identify potential molecules for therapeutic applications (68). For protein design, AlphaFold powers AlphaDesign, a toolset developed by Jendrusch et al. to predict sequences that can bind to a target protein (69).
The synergy between the cryo-EM resolution revolution and the predictive modeling revolution is immediately obvious. Near-atomic-resolution density maps provide a relatively detailed structural description of a macromolecular assembly but require additional modeling tools to analyze, annotate, and construct atomic models for the complex. Conversely, predictive modeling is capable of producing highly accurate models of individual proteins but often lacks the context of the entire assembly or functional mechanism. By combining the two techniques, it is possible to quickly and accurately build atomic models for complex assemblies (53) (Fig. 1).
Integrating predictive modeling and cryo-EM predates the introduction of AlphaFold and the current generation of machine-learning tools. Early work in combining computational modeling and cryo-EM sought to use the density map as a constraint in the modeling software (70,71,72). In these approaches, the cryo-EM density map, which was typically at a resolution worse than 6 Å, served as an envelope in fold-space, such that it could constrain model building, or as a target function for evaluating an ensemble of potential models. Accordingly, these approaches were limited by how accurately computational modeling could predict a structure from the sequence and how well the cryo-EM density map could constrain or identify a “good” model from a set of decoys. Although not as robust as fitting atomic models or as accurate as building models directly from the density, these constrained modeling approaches proved useful and paved the way for the future integration of predictive modeling and cryo-EM.
Similarly, new methods have arisen that combine experimental density maps with the current generation of predictive modeling tools. Terwilliger et al. implemented an iterative procedure that combines AlphaFold models with experimental density maps to improve and extend the original model (46,48). Here, AlphaFold models are automatically rebuilt using the density map and these rebuilt models are then used as templates for subsequent AlphaFold predictions. In other work, Terashi et al. introduced a method for evaluating cryo-EM models and applying local structure refinement. In this approach, AlphaFold2 models and multiple sequence alignments are used to identify and guide local refinement of the model (73).
In this review, we examine the progress of combining cryo-EM and predictive modeling for model building in large macromolecular complexes (Fig. 1). Here, we present several case studies that highlight the successes, failures, and limitations of this integrated approach. The power of integrating predictive modeling and near-atomic-resolution cryo-EM density maps is obvious, but, equally, there are limits to this approach and, ultimately, our understanding of the structure and function of macromolecular assemblies must be guided by experimental data.
Results
AlphaFold plus cryo-EM: When it just works
In many cases, predictive modeling can provide highly accurate models that can be fitted directly into the density map with little to no modification. In the following examples, the resolvability of features in the map was sufficient to guide selection and fitting of the models to the density with minimal alterations to the model itself.
Human complement factor C3 and invariant surface glycoprotein 65
In work by Sülzen et al., the invariant surface glycoprotein 65 (ISG65) from Trypanosoma brucei gambiense was determined to be the receptor for human complement factor C3 and its activation products (74). Additionally, it was shown that ISG65 specifically inhibits the activity of the alternative pathway C5 convertase, ultimately preventing cell lysis. The authors determined the cryo-EM structure of the human complement factor C3 and a proteolytic variant, C3b, in complex with ISG65 (Fig. 2 A). The resulting structures illustrated the structural basis of complement binding and led to a model for receptor-ligand interactions and, ultimately, a mechanism for immune escape of the parasite.
Figure 2.
Examples of AlphaFold2 models with cryo-EM density maps. In (A), the map (EMDB: 14707) and model (PDB: 7ZGI) for the human complement factor C3 and ISG65 are shown. The map for SLH hemocyanin is shown in (B). In (C), the map (EMDB: 25817) and model (PDB: 7TDZ) for a protomer of the cytoplasmic ring of the nuclear pore complex from Xenopus laevis are shown. In (D), the map (EMDB: 25817) and model (PDB: 7TDZ) for SARS-CoV-2 Nsp2 are shown. The map (EMDB: 14774) and model (PDB: 7ZLI) for PTX3 are shown in (E). In (F), the map (EMDB: 35245) and model (PDB: 8I8B) for the base domain of the AcMNPV nucleocapsid are shown. The map (EMDB: 15954) and model (PDB: 8BBE) for IFT-A are shown in (G). The portion of the map (EMDB: 33403) and model for an asymmetric unit (PDB: 7XR2) of Mudcrab Reovirus are shown in (H). All images were made with ChimeraX. Individual chains in the model are colored distinctly.
Single-particle cryo-EM density maps for the C3:ISG65 and C3b:ISG65 complexes were reconstructed to ∼3.6-Å resolution, although local resolution varied in different portions of the map. A structure for C3 was previously determined but a model for ISG65 had to be constructed with AlphaFold2. Both the C3 and ISG65 models were docked into the C3:ISG65 reconstruction and refined with Coot (75), Phenix (76), and REFMAC (77). A similar approach is shown in Fig. 1, where the predicted models were first fitted (fourth row) and then refined (fifth row) to optimally fit the cryo-EM density. Several regions of the ISG65 model appeared disordered. Subsequent AlphaFold2 modeling with density and biochemical constraints, where the refined ISG65 model served as a template, was used to complete the disordered portions of the model, primarily in the head domain. AlphaFold2 was also used to extend the model in the C-terminal domain, composed of a long linker domain and a long helix that anchors ISG65 to the cell membrane, resulting in a nearly full-length model for ISG65 (amino acids 18–436). The structures of the two complexes clearly depict how C3 and C3b interact with ISG65, leading to a model of complement binding on the surface of T. brucei gambiense.
SLH hemocyanin
Hemocyanins are multimeric oxygen transport proteins found in some invertebrates and are second only to hemoglobin in frequency as oxygen transport molecules. Assembled from decameric building blocks in molluscs, hemocyanins form didecamers or higher-order assemblies with D5 or C5 symmetry. In work by Pasqualetto et al., single-particle cryo-EM was used to determine the didecamer (7.0-Å resolution) and tridecamer (4.7-Å resolution) structures of a novel hemocyanin from the slipper limpet Crepidula fornicata (SLH) (78) (Fig. 2 B). In the didecamer, the decamer base is formed through tail-tail interactions, whereas the tridecamer adds a decamer building block in a head-tail configuration. Analysis of the transcriptome and comparison to the keyhole limpet-type hemocyanin revealed that the SLH hemocyanin was likely formed by SLH1 or SLH2, which shares 93% sequence identity with SLH1.
At these resolutions, direct modeling of the density was not possible. Moreover, at 3424 amino acids, SLH1 was too large to model using AlphaFold, whose typical size limit for modeling is between 1400 and 1800 amino acids. Instead, the sequence was segmented into six serial overlapping fragments of 800 amino acids. Each fragment was then predicted with AlphaFold2 using Google Colab; the resulting models for the individual fragments were superimposed to create a full SLH1 model. From the predictions, two possible models emerged that could be fitted together to form a model for the SLH1 dimer. This repeating unit of the decamer was similar to the keyhole limpet-type hemocyanin. Fivefold symmetry was applied to generate the molecular models for the complete didecamer and tridecamer. The predicted models were then fitted to the cryo-EM density maps of the didecamer and tridecamer, illustrating a good map-to-model fit and further elucidating higher-order oligomer formation in molluscan hemocyanins.
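The segmentation step can be sketched as follows. The exact fragment boundaries used for SLH1 are not given here, so this example assumes evenly spaced windows; the function name and its parameters are hypothetical and serve only to illustrate how a 3424-residue sequence can be covered by six overlapping 800-residue fragments.

```python
def overlapping_fragments(seq_length, fragment_length, n_fragments):
    """Return (start, end) windows, 0-based and half-open, that tile a sequence.

    Windows all have the same length and are spaced as evenly as possible, so
    the first starts at residue 0 and the last ends at the final residue.
    Even spacing is an assumption of this sketch, not a published protocol.
    """
    if n_fragments < 2 or fragment_length >= seq_length:
        raise ValueError("need at least two fragments shorter than the sequence")
    if n_fragments * fragment_length < seq_length:
        raise ValueError("fragments are too short to cover the sequence")
    span = seq_length - fragment_length
    windows = []
    for k in range(n_fragments):
        start = round(k * span / (n_fragments - 1))
        windows.append((start, start + fragment_length))
    return windows


# SLH1 numbers from the text: 3424 residues, six fragments of 800 residues.
for start, end in overlapping_fragments(3424, 800, 6):
    print(f"residues {start + 1}-{end}")
```

With these numbers, neighboring fragments share roughly 275 residues, which gives the superposition step a substantial common region for stitching consecutive predictions together.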
Nucleoporins
Although cryo-EM can routinely produce near-atomic-resolution density maps, significant structural information can be derived from lower-resolution density maps. In Fontana et al., single-particle cryo-EM was used to reconstruct the full cytoplasmic ring protomer and a core region of the nuclear pore complex (NPC) from Xenopus laevis to 6.9 and 6.7 Å, respectively (79) (Fig. 2 C). NPCs are composed of 30+ nucleoporins (Nups) and have approximately eightfold symmetry. The cytoplasmic ring is on the cytosolic side of the NPC, whereas the inner and luminal rings are on the plane of the nuclear membrane and the nuclear ring faces the nucleus.
As no high-resolution models of X. laevis Nups were available and only some of the Nups have known structural homologs, AlphaFold was used to predict models for the Nups. Five models for each of the individual Nups were generated. For each protein, the top models based on pLDDT were selected, whereas pTM was used to select the top complexes. As the resolution of the map was sufficient to resolve α helices, fitting of the Nup models was based on the prominent α-helical features in the map. Nups with β-propeller domains or ambiguous fittings were also predicted as complexes with α helix-containing Nups to guide the fitting. The resulting models for the cytoplasmic ring revealed a number of surprising features, including asymmetry in the composition and interactions among Nups.
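Selecting a top model by pLDDT can be illustrated with a short script. AlphaFold writes the per-residue pLDDT into the B-factor field of its PDB output, so the mean over Cα atoms gives a simple ranking criterion. This is a minimal sketch with hypothetical function names; it assumes fixed-column PDB text and does not reproduce the selection procedure of Fontana et al.

```python
def mean_plddt(pdb_text):
    """Mean pLDDT over C-alpha atoms, read from the B-factor columns (61-66)."""
    scores = [
        float(line[60:66])
        for line in pdb_text.splitlines()
        if line.startswith("ATOM") and line[12:16].strip() == "CA"
    ]
    if not scores:
        raise ValueError("no C-alpha ATOM records found")
    return sum(scores) / len(scores)


def rank_models(models):
    """Sort model names by decreasing mean pLDDT.

    `models` maps a model name to the text of its PDB file.
    """
    return sorted(models, key=lambda name: mean_plddt(models[name]), reverse=True)
```

In practice, `rank_models` would be given the five predictions for one protein, and the first name returned would be carried forward to fitting. A mean pLDDT hides local variation, so the per-residue plot remains worth inspecting before a model is accepted.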
AlphaFold plus cryo-EM: Some intervention required
Although predictive modeling can provide reliable and accurate models for individual proteins, oftentimes these models must be adjusted and refined to the experimentally derived density map. The following illustrates a number of examples that utilized AlphaFold to build models for individual protein components that then had to be manipulated to fit the density map.
SARS-CoV-2 nonstructural protein 2
To combat the COVID-19 pandemic, researchers investigated a variety of structural and nonstructural SARS-CoV-2 proteins as potential targets for vaccine and anti-viral therapies. SARS-CoV-2 nonstructural proteins fulfill a multitude of vital functions, often interacting with host proteins. These interactions, as well as their structure, are poorly characterized. Gupta et al. examined the SARS-CoV-2 nonstructural protein 2 (Nsp2), a protein implicated in replication and often mutated in variants of interest and concern (80) (Fig. 2 D). Nsp2 was isolated and imaged using single-particle cryo-EM, resulting in a ∼3.8-Å-resolution structure (Zn-). A model for Nsp2 was constructed from the well-resolved portions of the density map using DeepTracer (81), Coot, and ISOLDE (82) and then refined with Phenix and Rosetta (83,84). Several putative zinc-binding sites were identified in the model; reimaging Nsp2 purified with zinc led to an improved 3.2-Å-resolution cryo-EM density map. Interestingly, in the zinc-bound Nsp2 structure, the C-terminal 130 amino acids were not resolved, although they were present in the Nsp2 density map without zinc, albeit at resolutions precluding model building directly from the density alone. As such, AlphaFold was used to complete the model of Nsp2.
Notably, Nsp2 was one of the targets for the CASP14 challenge, where 37 groups generated 135 predicted models of the protein (42). When compared to the Nsp2 model built from the density map, only the AlphaFold model had an RMSD <20 Å. Even the AlphaFold model had a poor global fit to the density map; however, visual analysis revealed strong local similarities. The best AlphaFold model was split into four domains and independently fitted to the density map. The model containing the C-terminal 130 amino acids fitted well to the cryo-EM density map without zinc. The individual domains were then stitched back together and refined with Rosetta to yield a nearly complete structure of Nsp2. When combined with analysis of sequence variation, Gupta et al. were able to suggest a number of biological roles for Nsp2 and regions of interest on the protein, including potential motifs for nucleic acid interactions.
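Splitting a predicted model into domains for independent fitting amounts to partitioning its coordinate records by residue range. The sketch below shows one way this could be done on PDB text; the function name is hypothetical and the domain boundaries must be supplied by the user, since the boundaries used for Nsp2 are not stated here.

```python
def split_into_domains(pdb_text, domains):
    """Partition ATOM/HETATM records into per-domain PDB strings.

    `domains` maps a domain name to an inclusive (first, last) residue range.
    Residue numbers are read from PDB columns 23-26. Records outside every
    range are dropped; a residue in two ranges goes to the first match.
    """
    pieces = {name: [] for name in domains}
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        residue_number = int(line[22:26])
        for name, (first, last) in domains.items():
            if first <= residue_number <= last:
                pieces[name].append(line)
                break
    return {name: "\n".join(lines) + "\nEND\n" for name, lines in pieces.items()}
```

Each returned string can be written to its own file and rigid-body fitted separately, after which the fitted pieces are rejoined and refined as a single chain, as described above.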
PTX3
Examples such as Nsp2, where the individual domain structures are predicted correctly but their arrangement is not, are probably the most common examples of predictive modeling errors. These types of errors are relatively easy to fix with manual or automated modeling approaches; however, there are other ways that AlphaFold predictions can be guided to better represent the experimentally determined density maps. Proteins of the pentraxin family, most of which are associated with human innate immunity, are generally characterized by a PTX domain at the C terminus and are arranged as pentameric complexes with a discoid shape. Pentraxins are grouped into two general classes, short and long pentraxins, where the long pentraxins contain a unique N-terminal domain. Although a number of short pentraxin structures have been solved, Noone et al. used cryo-EM to solve the structure of PTX3, a prototypical long pentraxin, to 2.5-Å resolution (85) (Fig. 2 E). The resulting structure, a glycosylated D4 symmetrical octamer, had a considerably different ultrastructure from that of the short pentraxins. The C-terminal pentraxin domain was clearly resolved, although the region of the cryo-EM density map corresponding to the N-terminal domain was not and only indicated the presence of α helices.
In the reconstruction of PTX3, only the region of the N-terminal domain closest to the PTX domain was resolved. Sequence analysis of the N terminus predicted a nearly continuous 100+ amino acid coiled-coil domain, agreeing in principle with the structure of a small portion of the N-terminal domain adjacent to the PTX domain. In modeling the N-terminal domain of PTX3, AlphaFold was used to generate a model. However, from sequence alone, AlphaFold was not able to predict the overall architecture of octameric PTX3. Noone et al. surmised that providing a homotetrameric coiled-coil template could seed the AlphaFold2 prediction. Using a canonical homotetrameric coiled-coil sequence fused to the C-terminal portion of the N-terminal region, AlphaFold was able to predict the tetrameric coiled-coil motif. The model for the N terminus was then concatenated to the PTX3 domains and refined. The presence of the coiled-coil motif was then validated using two-dimensional class averages, which were consistent with a central PTX core and flexible N-terminal regions protruding from either side of the PTX domains. The resulting model allowed for the mapping of ligand binding sites and could ultimately be important in designing therapeutics.
Baculovirus nucleocapsid
Baculoviruses play an important role as biopesticides in insect control, as well as being essential tools in biomedical research, gene therapy, and vaccine development (86). Autographa californica multiple nucleopolyhedrovirus (AcMNPV) is the most well-studied member of Baculoviridae and was the first baculovirus to be sequenced (87,88). The virus contains a 134-kbp circular dsDNA genome encapsulated by an enveloped, rod-shaped nucleocapsid; the major capsid protein VP39 and additional minor proteins comprise the nucleocapsid. Using single-particle cryo-EM, the structure of the occlusion-derived virions of AcMNPV was solved to 3.2-Å resolution, revealing an intricate assembly of VP39 and six additional structural proteins that make up the head and base of the nucleocapsid (89) (Fig. 2 F).
As with many biological systems, only partial a priori structural information was available. VP39 was known to compose the cylindrical core of the virus, but the composition of the structural proteins at the base and head of the virus remained relatively unknown (90). At 3.2-Å resolution, a model for VP39 could be readily constructed de novo from the density map; the resolution at the base and head portions of the nucleocapsid was slightly lower at ∼4.2 Å. The lower resolution and lack of definitive knowledge of the proteins in the head and base precluded de novo model building. However, the resolvable features within the head and base could be used to discriminate between possible models. As such, models for all AcMNPV proteins were generated with AlphaFold2 and fitted to the head and base region using a hierarchical fitting scheme. In brief, unmodeled head and base densities were fitted with each AlphaFold model. Density that was well fitted by a model was assigned the corresponding protein and then subtracted from the intact map. For the remaining densities, each AcMNPV protein sequence was then broken into two halves and, as before, modeled with AlphaFold2 and fitted to the density map. Again, well-fitted density was assigned to the corresponding protein and then subtracted from the previous map. This process was repeated, eventually yielding models for six capsid proteins (AC98, AC101, AC104, AC109, AC142, and AC144) that comprise almost the entirety of the asymmetric unit in the head and base regions of the reconstruction. Applying symmetry to these models and refining them to the density resulted in a near-complete atomic structure for the head and base of the nucleocapsid, revealing that the core VP39 cylinder is constricted by an outer shell ring composed of three capsid proteins, AC104, AC142, and AC109. 
Additionally, AC101 and AC144 appeared to form an elaborate inner layer, arranged with C14 symmetry, at both the capsid head and base, whereas AC98, along with VP39, appears to hold the first genome strand in place at the base of the capsid. Based on this, the authors were able to suggest a mechanism for assembly and packaging of an exogenous genome.
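The logic of the hierarchical fitting scheme (fit, assign, subtract, then repeat with smaller pieces) can be expressed as a toy example. Everything in this sketch is an illustrative assumption: density and model footprints are reduced to sets of voxel indices, each footprint stands in for the best placement found by a real fitting program, and the 0.8 acceptance threshold is arbitrary. It is not the software or scoring used for the AcMNPV reconstruction.

```python
def fit_fraction(footprint, density):
    """Fraction of a model's voxels that fall inside the remaining density."""
    if not footprint:
        raise ValueError("model footprint is empty")
    return len(footprint & density) / len(footprint)


def hierarchical_assign(density, candidates_by_round, threshold=0.8):
    """Greedy assign-and-subtract over successive rounds of candidate models.

    `candidates_by_round` is a list of dicts (name -> voxel set); the first
    round would hold whole-protein models, later rounds hold fragments.
    Returns the accepted assignments and the density left unexplained.
    """
    remaining = set(density)
    assigned = {}
    for candidates in candidates_by_round:
        # Try the best-fitting candidates first within each round.
        ordered = sorted(
            candidates,
            key=lambda name: fit_fraction(candidates[name], remaining),
            reverse=True,
        )
        for name in ordered:
            footprint = candidates[name]
            if fit_fraction(footprint, remaining) >= threshold:
                assigned[name] = footprint
                remaining -= footprint  # subtract the explained density
    return assigned, remaining
```

Subtracting explained density before each new round is what allows smaller fragments to be placed unambiguously: they compete only for the density that whole-protein models could not account for.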
AlphaFold plus cryo-EM: Modeling large complexes with AlphaFold-Multimer
The original implementation of AlphaFold focused primarily on the prediction of individual protein structures from a single sequence. In most cases, cryo-EM density maps of macromolecular assemblies are composed of many different proteins, interconnected through vast arrays of interactions that could not be captured with standard AlphaFold predictions. With AlphaFold-Multimer, homo- and hetero-oligomeric structures can be directly modeled, taking into account various interactions between the multiple proteins.
Intraflagellar transport A
Cilia require bidirectional intraflagellar transport (IFT) of signaling molecules, waste, and other materials to function properly; dysfunction of the IFT is linked with a variety of disease states in humans. Transporting proteins between the cilia and cell body, IFT trains are polymers consisting of two large complexes, IFT-A and IFT-B, as well as motor proteins (91). Hesketh et al. reconstituted the human IFT-A complex, a 767-kDa complex composed of six subunits (IFT43, IFT121, IFT122, IFT139, IFT140, IFT144) and used cryo-EM to reconstruct it, revealing unexpected subunit organization (92) (Fig. 2 G). This work illustrated how IFT-A polymerizes and interacts with IFT-B, as well as how it uses an array of β-propeller and TPR domains to create “carriages” that engage adaptor proteins.
Initial image processing revealed that IFT-A actually consists of two modules (IFT-A1 and IFT-A2), which are connected by a flexible linker. IFT-A2 was observed to be relatively rigid and reconstructed to 3.5-Å resolution, whereas IFT-A1 was more flexible and only reconstructed to 7- to 15-Å resolution. As with the other examples, model building for IFT-A2 began by direct de novo model building in the well-resolved portions of the density map. Using Coot, initial models for IFT122, IFT121, and IFT139 were constructed and completed by rigid and flexible fitting of AlphaFold models to the density map using MDFF (93) and ISOLDE. AlphaFold-Multimer was used to generate a prediction of a complex of IFT139, IFT121, and IFT43; this model was fitted to the density and used to localize IFT43 to a region of unmodeled density between IFT121 and IFT139. To model the IFT144 and IFT140 N-terminal β-propeller domains, AlphaFold-Multimer was used to generate a model of these two protein domains, along with the C-terminal region of IFT122, which was then fitted to the IFT-A1 map using Chimera. Models for the remaining portions of IFT144 and IFT140 were also generated with AlphaFold and fitted to the density using MDFF with adaptive distance restraints in ISOLDE. Finally, the IFT-A1 and IFT-A2 models were joined via an overlapping region in IFT122, resulting in a complete IFT-A model that was then used to direct mutagenesis experiments to further analyze the mechanisms of IFT bidirectional transport.
AlphaFold plus cryo-EM: Limitations
Even when intervention is required to fit the experimental data, AlphaFold can provide relatively accurate predictions of protein structure, particularly at the level of protein domains. However, AlphaFold is not infallible and can generate results that are inconsistent with experimental data for a number of reasons.
Prions
Postulated over 60 years ago, Anfinsen’s thermodynamic hypothesis theorized that protein folding was guided by thermodynamics and that the native conformational state of a protein was the most thermodynamically stable shape in the cell. However, by the 1990s, intrinsically disordered proteins or domains were identified, and, within the next decade, proteins that adopt two unique three-dimensional conformations were discovered (94,95,96). Although AlphaFold is not intrinsically a template-based structural modeling tool, it does rely on the comprehensive knowledge of experimentally solved protein structures. So, the question then arises, how does AlphaFold2 handle the prediction of metamorphic proteins that have known structures in distinctively different states?
One of the best-known families of metamorphic proteins is the prions (97,98). The structure of the first prion protein, the benign PrPc isoform, was solved in 1996, revealing a globular, mostly α-helical protein (99) (Fig. 3 A). Over a decade later, the structure of the infectious PrPsc isoform was solved, revealing a β strand-rich amyloid structure (100) (Fig. 3 B). To assess AlphaFold’s ability to predict the structure of a metamorphic protein, the full-length sequence for the human prion protein (UniProt: P04156·PRIO_HUMAN) was submitted to AlphaFold2, returning a series of mainly globular, α-helical models (Fig. 3 C). The top scoring model had a pLDDT score of 62.9 and a pTM of 0.424, indicating only moderate reliability of the prediction. Further examining the models, the N terminus of the prion protein structure had poor reliability (pLDDT < 50), whereas the core of the prion protein had relatively high reliability and resembled the previously described benign, globular PrPc isoform (PDB: 1QM0; RMSD, 0.932 Å). Truncating the human PrP sequence so that it contained only the “core” residues found in both the benign and infectious structural isoforms, residues 91–231, again produced the globular, α-helical structure and not the β strand-rich structure found in the infectious PrPsc isoform (PDB: 6LNI; RMSD, 16.06 Å) (Fig. 3 D). Attempts to use AlphaFold-Multimer also returned structures similar to the PrPc isoform and not the β strand-rich amyloid structure found in the PrPsc isoform. Moreover, increasing the number of seeds or introducing a template structure failed to produce any models consistent with the PrPsc isoform.
The top model for the human PrP AlphaFold prediction had a pLDDT score of 76.6 and a pTM score of 0.6, indicating that the protein structure was relatively well predicted; however, fitting the predicted model into the human PrPsc cryo-EM density map with UCSF ChimeraX (101) resulted in a cross-correlation score of 0.3619 and no visual agreement with the structural features found in the map (Fig. 3 E and F). As such, it would appear that AlphaFold is capable of producing models for only one state of a metamorphic protein.
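The cross-correlation scores quoted in this review summarize how well a model-derived map matches the experimental density. As a minimal sketch of the idea, and not the exact routine used by UCSF ChimeraX, the Python function below computes a normalized correlation between two maps that are assumed to have already been sampled on the same voxel grid and flattened into equal-length lists. The function name and the `about_mean` option are illustrative choices and do not come from the original work.

```python
import math


def cross_correlation(map_a, map_b, about_mean=False):
    """Normalized correlation between two equally sized lists of voxel values.

    Assumption: both maps are already on the same grid and flattened.
    With about_mean=True each map is mean-centered first (Pearson-style).
    """
    if len(map_a) != len(map_b) or not map_a:
        raise ValueError("maps must be non-empty and of equal length")

    if about_mean:
        mean_a = sum(map_a) / len(map_a)
        mean_b = sum(map_b) / len(map_b)
        map_a = [a - mean_a for a in map_a]
        map_b = [b - mean_b for b in map_b]

    dot = sum(a * b for a, b in zip(map_a, map_b))
    norm = math.sqrt(sum(a * a for a in map_a)) * math.sqrt(sum(b * b for b in map_b))
    if norm == 0.0:
        # A flat (all-zero or constant, if centered) map carries no signal.
        return 0.0
    return dot / norm
```

A score near 1 means the two maps rise and fall together, whereas a low score, such as the 0.3619 reported for the PrP prediction, means the model accounts for little of the observed density.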
Figure 3.
AlphaFold and the human prion protein. In (A) the structure of the human PrPc is shown, while (B) shows the PrPsc isoform. Here both PrP structures are colored from N- (blue) to C- (red) terminus. The AlphaFold model for the full-length PrP sequence is shown in (C) and colored based on pLDDT score. In (D), the AlphaFold model for PrP is shown overlapped with the PrPc structure, shown in magenta. The AlphaFold model for PrP was fit to the PrPsc cryo-EM density map in (E). While the correlation score is reasonable, the model clearly does not resemble the PrPsc map and model (F).
Mud crab reovirus
Mud crab reovirus (MCRV) is a 12-segment dsRNA reovirus that infects Scylla serrata. It belongs to a family of nonturreted reoviruses, Sedoreoviridae, all of which have an outer icosahedral capsid surrounding an icosahedral inner capsid; the outer capsid shell has 13 subunits per triangular face (T = 13), whereas the inner capsid has only two (pseudo T = 2). Despite having a common architecture, sequence similarity among various reovirus structural proteins varies considerably. In the work by Zhang et al., the authors determined the structure of both a quiescent and actively transcribing MCRV capsid to 3.1- and 3.4-Å resolution, respectively (102) (Fig. 2 H) using single-particle cryo-EM. Their structure revealed a complex set of interactions that stabilize the capsid and anchor the RNA-dependent RNA polymerase to the interior surface of the inner capsid proteins.
At the stated resolutions, modeling of the MCRV structural proteins could be done using standard de novo cryo-EM model-building tools. Complete models for two outer capsid proteins, VP11 and VP12, two conformations of the inner capsid (VP3A and VP3B), and the RNA-dependent RNA polymerase (VP1) were constructed and refined against the density map. Structural similarity between VP12, VP3, and VP1 and the corresponding structural proteins in other reoviruses could be seen even though only limited sequence similarity existed. VP11 appeared to contain a new fold and showed no sequence or structural similarity to other viral structural proteins.
In a separate work, Hryc and Baker assessed the utility of AlphaFold2 models in interpreting the structure of complex macromolecular assemblies from cryo-EM (53). In this work, models for the MCRV structural proteins were built using AlphaFold2. For each of the MCRV structural protein models, the pLDDT scores were relatively low. Although low pLDDT scores indicate low confidence in the accuracy of prediction, they do not necessarily indicate that the model is incorrect or does not fit the experimental data. However, in this case, the models produced by AlphaFold2 for the MCRV proteins were not consistent with the cryo-EM density map; the AlphaFold2 models for these proteins had an RMSD > 5 Å when compared to the experimentally derived models. It is interesting to note that a number of capsid protein structures from other reoviruses have been solved and were likely included in the AlphaFold2 training set. The possible limitation of AlphaFold2 in this case could be attributed either to the complex set of interactions that bridge different types of symmetry among the various viral proteins or to the potential dynamic nature of the structural proteins required for assembling, packaging, and stabilizing the capsid.
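The RMSD values used throughout these comparisons measure the average displacement between equivalent atoms in two models. The sketch below shows the arithmetic only, under two stated assumptions: the two coordinate lists (for example, Cα positions) are paired residue by residue, and the models have already been superposed. Structure-comparison programs normally perform an optimal superposition before reporting RMSD, and that step is omitted here; the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. Å).

    Assumption: coords_a[i] and coords_b[i] are the same atom in two
    already-superposed models; no alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")

    total_sq = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total_sq += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total_sq / len(coords_a))
```

Because the squared distances are averaged over all atom pairs, a single badly placed domain can dominate the value, which is one reason a prediction with good local structure can still give a large whole-chain RMSD.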
Discussion
Based on the work outlined here, it is increasingly evident that predictive modeling and cryo-EM provide a tantalizing mechanism for deciphering macromolecular structure and function. These examples represent a small number of recent works that have combined cryo-EM and AlphaFold to extend the model-building process beyond what was previously possible. To generate models with AlphaFold, only a protein sequence is required; the one caveat is that the maximum sequence length is approximately 1400–1800 amino acids at the present time. To “push” the AlphaFold prediction, template structures may be provided. Regardless of protein size or the use of a template, predictions with AlphaFold only take a few minutes to hours to generate a model de novo. The structures represented in these works are a direct product of availability, speed, accuracy, robustness, and ease of use in the AlphaFold prediction pathway.
As mentioned, the models themselves are not the only output of AlphaFold. Several statistical measures, such as pLDDT, pTM, and the PAE matrix, are provided to help the user assess the overall quality and robustness of the model. Unfortunately, not all published structures using AlphaFold or other predictive modeling programs report these scores. As seen in Table 1, which summarizes the examples discussed in this review, there is little consistency in reporting modeling scores. Some examples evaluated predictive models in the same way an experimental structure would be validated, whereas others reported only a selection of modeling statistics. Although it is beyond the scope of this work to suggest a set of criteria for predictive modeling, it is imperative that the community respond to the explosive growth in predictive modeling to establish a set of metrics that can be used to assess model quality and fit to experimental data.
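Reporting these scores is straightforward because AlphaFold stores the per-residue pLDDT in the B-factor column of its PDB-format output. The following is a minimal sketch, assuming fixed-column PDB `ATOM` records and one Cα atom per residue, of how an average pLDDT and a list of low-confidence residues could be extracted for a table such as Table 1. The function names and the default cutoff of 50 are illustrative choices.

```python
def per_residue_plddt(pdb_lines):
    """Map (chain ID, residue number) -> pLDDT, read from C-alpha ATOM records.

    Assumption: lines follow the fixed-column PDB format, with the pLDDT
    written in the B-factor field (columns 61-66).
    """
    scores = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain_id = line[21]
            res_seq = int(line[22:26])
            scores[(chain_id, res_seq)] = float(line[60:66])
    return scores


def mean_plddt(scores):
    """Average pLDDT over all residues in the dictionary."""
    if not scores:
        raise ValueError("no residues found")
    return sum(scores.values()) / len(scores)


def low_confidence_residues(scores, cutoff=50.0):
    """Residues whose pLDDT falls below the cutoff, in sorted order."""
    return sorted(key for key, value in scores.items() if value < cutoff)
```

Reading the file with `open(path).read().splitlines()` and passing the lines to `per_residue_plddt` would be enough to reproduce an average score; pTM and the PAE matrix are stored separately from the coordinates and are not covered by this sketch.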
Table 1.
Model Metrics
| Specimen | Reported Method | Template | Resolution (Å) | RMSD (Å) | MolProbity | pLDDT | pTM | CC |
|---|---|---|---|---|---|---|---|---|
| ISG65 | AlphaFold2 | Y | 3.58, 3.59 | 2.00 | 1.12–1.34 | – | – | – |
| Hemocyanin | AlphaFold2_advanced (ColabFold) | N | 4.7, 7.0 | – | – | 87.39–90.48 | 0.9–0.43 | – |
| Nucleoporins | AlphaFold2, AlphaFold-Multimer (ColabFold) | N | 6.7, 6.9 | 0.12–3.4 | – | 80.15–87.5 | 0.6–0.83 | 0.75–0.88 |
| Nsp2 | AlphaFold2 | Y | 3.15, 3.76 | 20+ | 0.86, 0.75 | – | – | – |
| PTX3 | AlphaFold2 (ColabFold v1.3) | Partial | 2.5 | – | 1.80 | displayed in figure | – | – |
| AcMNPV nucleocapsid | AlphaFold2 | N | 3.2 | – | – | – | – | 0.57–0.74 |
| IFT-A | AlphaFold2, AlphaFold-Multimer (DeepMind and ColabFold) | N | 3.5, 7.0 | 26.00 | 0.88 | displayed in figure | – | 0.77 |
| Human Prion | AlphaFold2 | N | 2.7 | 0.932, 16.06 | – | 62.9–76.6 | 0.424–0.60 | 0.36 |
| MCRV | AlphaFold2_advanced (ColabFold) | N | 3.1 | 18.35–54.38 | – | 29.12–43.99 | 0.13–0.31 | 0.05–0.17 |
A list of the nine examples and their corresponding statistics is shown. Column 2 reports the method used to generate the model, column 3 denotes whether a template model was used in modeling (yes (Y)/no (N)), and column 4 gives the map resolution (Å) reported by the authors. Model RMSD (Å) and MolProbity score are reported in columns 5 and 6. Average pLDDT, pTM, and cross-correlation scores (CC, from 0 to 1.0) are shown in columns 7, 8, and 9, respectively.
Ultimately, the current set of output scores can provide a measure of model accuracy and/or reliability; however, they are merely guides and, as such, the models must be evaluated using experimental data. In Hryc and Baker, models for the bacteriophage syn5 capsid proteins were computed with AlphaFold2 (53). The pLDDT scores for these structural proteins were fairly low (below 70), yet the models fit the cryo-EM density map well after some human intervention to reorient the domains and refine the final models. If these models had been discarded because of their relatively poor pLDDT scores, structures for these capsid proteins could likely not have been constructed based on the density alone. In other examples, regions of low pLDDT have been used as markers of potential disorder in the protein (103,104,105).
Conversely, a model with a high pLDDT score is not necessarily correct. In the example of SARS-CoV-2 Nsp2, a current AlphaFold2 prediction returns five models with average pLDDT scores ranging from ∼73 to 78 and pTM scores between 0.53 and 0.63; scores in this range indicate that the model is relatively reliable, particularly at the level of the overall backbone fold (Fig. 4 A and B; Table 2). However, when compared to the published structure of Nsp2 (80), RMSD for the models ranged from ∼14.5 to 44.4 Å and cross-correlation to the density map ranged from ∼0.5 to 0.54, indicating a relatively poor agreement between the predicted models, the published structure, and the cryo-EM density map. Examining the models more closely, the pLDDT scores for the N-terminal and C-terminal portions of the model indicated that these domains were relatively well predicted. The intervening residues had varied pLDDT scores among the five AlphaFold models. By dividing the model into three separate domains, N-terminal (residues 1–287), middle (residues 288–477), and C-terminal (residues 478–638), the individual domains could be more accurately fitted to the map, resulting in an RMSD of 2.934 Å and a cross-correlation of 0.68 (Fig. 4 C). A single round of real-space refinement on this composite model using phenix.real_space_refine (106,107) produced a final model with an RMSD of 2.148 Å and a cross-correlation of 0.757 (Fig. 4 D). Given a cross-correlation value of 0.77 for the published structure compared to the density map, the refined composite model was nearly identical to the published structure. So, despite the relatively good pLDDT and pTM scores, AlphaFold was only able to produce good local structure, requiring manual intervention to generate a model consistent with the density map.
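The domain-wise fitting described above starts by cutting the predicted model at the stated residue boundaries. As a hedged illustration of that bookkeeping step only (the fitting itself was done in ChimeraX), the sketch below splits fixed-column PDB records into the three Nsp2 residue ranges given in the text. The function and constant names are hypothetical, and the code assumes a single chain with standard residue numbering.

```python
# Residue ranges for the three Nsp2 domains, as given in the text.
NSP2_DOMAINS = {
    "N-terminal": (1, 287),
    "middle": (288, 477),
    "C-terminal": (478, 638),
}


def split_by_domain(pdb_lines, domains=NSP2_DOMAINS):
    """Group ATOM/HETATM records by the residue range they fall into.

    Assumption: fixed-column PDB format, residue number in columns 23-26.
    Records outside every range, and non-coordinate records, are skipped.
    """
    pieces = {name: [] for name in domains}
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        try:
            res_seq = int(line[22:26])
        except ValueError:
            continue  # malformed or truncated record
        for name, (first, last) in domains.items():
            if first <= res_seq <= last:
                pieces[name].append(line)
                break
    return pieces
```

Each list in the returned dictionary could then be written to its own file and fitted to the density map as an independent rigid body before the pieces are rejoined and refined.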
Figure 4.
AlphaFold and Nsp2. The published structure of Nsp2 (magenta) is shown fit to the cryo-EM density in (A). From the Nsp2 sequence, five models were generated with AlphaFold2. Each of the models was aligned to the known Nsp2 structure (magenta), shown in (B). Each AlphaFold model is colored based on pLDDT. Based on visual inspection, the model with the highest pLDDT was broken into three domains: N-terminal (blue), C-terminal (orange), and middle (green). Each of these domains was fit to the Nsp2 density map with ChimeraX. In (C), the three fit domains align well with the known Nsp2 structure. After annealing and refining the domains to the density map, the final refined AlphaFold model (yellow) had an RMSD of ∼2.1 Å, shown in (D).
Table 2.
Modeling Nsp2
| Coordinates | pLDDT | pTM | RMSD (Å) | CC |
|---|---|---|---|---|
| Model 1 | 76.5 | 0.619 | 18.591 | 0.528 |
| Model 2 | 75.5 | 0.633 | 14.52 | 0.544 |
| Model 3 | 72.9 | 0.526 | 44.371 | 0.514 |
| Model 4 | 77.5 | 0.59 | 28.73 | 0.529 |
| Model 5 | 77.6 | 0.592 | 11.258 | 0.5 |
| N-terminal domain | – | – | 2.344 | 0.732 |
| Middle domain | – | – | 4.14 | 0.607 |
| C-terminal domain | – | – | 1.981 | 0.751 |
| Composite model | – | – | 2.934 | 0.68 |
| Refined model | – | – | 2.148 | 0.757 |
| 7msw | – | – | – | 0.77 |
Statistics for the AlphaFold-generated models are shown. In column 1, the model or domains are listed. Columns 2–5 list the average pLDDT score, pTM score, the RMSD compared to the published structure (Å), and the cross-correlation score (CC) after fitting to the cryo-EM density map using UCSF ChimeraX.
Even with some manual intervention, the power of AlphaFold and predictive modeling is apparent, and the use of such tools has wide-ranging implications. Predictive models could serve as key drivers in developing hypotheses, designing experiments, or validating experimental models. In another example from Hryc and Baker, AlphaFold2 was used to analyze the structure of another bacteriophage, ε15 (53). Here, AlphaFold2 was used to determine that the model for one of the capsid proteins, constructed directly from the cryo-EM density map, was incorrect and had potentially been built from the wrong protein sequence. Without a doubt, AlphaFold2 and predictive modeling have established the ability to accurately predict three-dimensional structure from sequence alone for a wide range of proteins. But has AlphaFold2 made structure determination trivial? AlphaFold2 was originally trained on a set of 170,000 single-chain protein structures from the PDB (40). As such, AlphaFold2 does not explicitly understand multimeric structures and how individual proteins interact with each other in forming complex assemblies. Thus, it is to be expected that AlphaFold2 will be limited in ability and accuracy when modeling large complexes. However, that is not to say that AlphaFold2 cannot model complexes or individual proteins in a complex with some level of confidence. Based on the examples presented here, as well as in other published works, AlphaFold2 predictions can serve as good initial models that may require some additional intervention to fit the experimental data. The advent of AlphaFold-Multimer, which was trained on protein complexes, represents a step forward in the ability to simultaneously predict the structure of all protein components in a complex, although the success rate of the prediction is still significantly worse than the prediction of single-protein-chain structures (63,64).
Additionally, macromolecular assemblies can range from stable to transient complexes; thus, they can have multiple functional states and adopt a wide range of local and global structural changes to elicit a function. Experimental methods, such as cryo-EM, can capture some of these dynamics, often illuminating the structural basis for function (16,108,109). Predictions by AlphaFold2 essentially represent a single, static structural prediction of a protein; AlphaFold2 does not (yet) predict protein dynamics. As such, to extract dynamic information from AlphaFold2, the models must once again be fitted and refined to the experimental data. In some cases, such as prions, the metamorphic properties of the protein/complex are simply too great and AlphaFold2 fails to generate a reasonable model for not just the complex but the individual protein of interest. However, work by del Alamo et al. used AlphaFold2 to sample alternative conformations of transporters and G-protein-coupled receptors by reducing the depth of the input multiple sequence alignments through stochastic subsampling. The resulting models spanned the conformations between the known structures, providing a potential mechanism for generating an ensemble of models for use in characterizing a dynamic system (110). Additionally, the possibility of combining molecular dynamics with predicted models and experimental data offers a potential avenue for addressing motion in macromolecular assemblies. Generally, molecular dynamics approaches are limited to less than a millisecond and often only sample close to the original structure (111). Using a fitted predicted model or pieces of a predicted model, a cryo-EM density map could serve as an “envelope” to guide the simulation (72). Coupling this with methods that characterize sample motions in cryo-EM data (108,109,112) could provide for a fairly “high-resolution” simulation of macromolecular dynamics involving larger and more complex systems.
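The idea of reducing alignment depth by stochastic subsampling can be illustrated with a short sketch. The code below is not the published protocol, which adjusts settings inside the AlphaFold2 pipeline; it only shows the concept on an alignment held as a list of aligned sequences, with the query assumed to be the first entry. The function names, the `depth` parameter, and the seeding scheme are all assumptions made for illustration.

```python
import random


def subsample_msa(msa, depth, seed=None):
    """Return a shallower alignment: the query plus randomly chosen homologs.

    Assumption: msa[0] is the query sequence and must always be kept.
    `depth` is the total number of sequences in the returned alignment.
    """
    if not msa:
        raise ValueError("alignment is empty")
    if depth < 1:
        raise ValueError("depth must be at least 1")

    query, others = msa[0], msa[1:]
    rng = random.Random(seed)
    n_extra = min(depth - 1, len(others))
    # Sample indices, then sort them so the original ordering is preserved.
    picked = sorted(rng.sample(range(len(others)), n_extra))
    return [query] + [others[i] for i in picked]


def subsampled_ensemble(msa, depth, n_runs, seed=0):
    """Draw several independent shallow alignments, one per prediction run."""
    return [subsample_msa(msa, depth, seed=seed + run) for run in range(n_runs)]
```

Each shallow alignment would be passed to a separate prediction run, and the spread among the resulting models gives the ensemble described in the text.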
As a structure prediction tool, AlphaFold2 can generate a model for nearly any protein sequence. But how are those models affected by small changes to the sequence, or, in other words, will AlphaFold2 produce accurate models of proteins with mutations? Unfortunately, given the way AlphaFold2 models are constructed, small changes to the sequence are unlikely to result in major structural changes. As an exercise, the sequence of the major capsid protein for P22, GP5, was submitted to AlphaFold2 (UniProt: P26747·CAPSD_BPP22) (Fig. 5). The predicted model for GP5 was very similar (RMSD, 3.057 Å) to the procapsid GP5 structure solved by single-particle cryo-EM to 2.6-Å resolution (113) (PDB: 8I1V). Randomly introducing single-amino-acid mutations in silico (X→Ala), from 0% to 30% of the entire sequence of GP5, did not result in any meaningful structural differences in the predicted model (RMSD ≤ 4.0 Å). Although increasing the number of mutations did lower the pLDDT score, the pLDDT score did not drop below 70 until 30% of the residues were mutated and never dropped below 50 in the examples shown. In fact, over 35% of the entire sequence had to be mutated to induce a significant structural difference (RMSD ≥ 5.0 Å); even sequences with 45% mutations still produced a model with a recognizable structure when compared to the known structure of GP5. While this was a simple in silico test of mutational tolerance, it remains unknown if similar levels of mutation in an expressed protein would result in authentic structural differences or altered capsid assembly.
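The in silico mutation step in this exercise can be sketched in a few lines. The function below replaces a chosen fraction of residues with alanine at random positions; it is an illustration of the X→Ala procedure described in the text, not the script used by the authors. One assumption is labeled in the code: positions that are already alanine are excluded from selection so that every selected position is a genuine change, and the text does not say whether the original exercise did the same.

```python
import random


def mutate_to_alanine(sequence, fraction, seed=None):
    """Replace a random `fraction` of the residues in `sequence` with Ala.

    Assumption: existing alanines are not eligible, so the number of real
    substitutions equals round(fraction * len(sequence)) whenever enough
    non-alanine positions are available.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")

    candidates = [i for i, aa in enumerate(sequence) if aa != "A"]
    n_mutations = min(round(fraction * len(sequence)), len(candidates))

    rng = random.Random(seed)
    chosen = set(rng.sample(candidates, n_mutations))
    return "".join("A" if i in chosen else aa for i, aa in enumerate(sequence))
```

Calling the function with fractions from 0.0 to 0.45 and a fixed seed would give a reproducible series of sequences, each of which could be submitted for prediction and compared with the reference structure by RMSD and pLDDT.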
Figure 5.
Mutations to P22 GP5. The structure of the major capsid protein from the P22 procapsid structure is shown in the top left. The GP5 model is colored from the N- (blue) to C- (red) terminus. In the remaining images, the top AlphaFold model based on pLDDT is shown. The number in the top-left corner indicates the percentage of amino acids randomly mutated (0–45%) and the number in the lower-right corner indicates the RMSD between the model and the known structure.
Conclusions
Although tremendous strides have been made in predictive modeling of protein structure, opening computational structural biology to a large number of researchers, predictive models, particularly those for large complexes or individual proteins in a complex, lack knowledge of their environment and dynamics. Predictive modeling can be a good starting point for structural characterization, but its ability to model complex structural changes and modifications is limited. As such, modeling complex macromolecular assemblies requires knowledge and detail currently afforded only by direct experimental data. The examples shown here represent a wide range of cases that illustrate how AlphaFold models can be used to interpret protein structure in cryo-EM density maps, as well as their potential limitations.
Although not directly addressed in the examples presented here, AlphaFold and other similar tools cannot currently model protein modifications, ligands, ions, lipids, and nucleic acids that are often part of large macromolecular complexes. Although there is obvious development in this area, experimental structure determination and modeling remain the only way to resolve these features. Advances in predictive modeling will likely improve the accuracy, reliability, and complexity of both protein and nonprotein components, and, when combined with experimental data, could represent an extremely efficient and powerful approach to address the sequence, structure, and function relationship in complex macromolecules.
Author contributions
M.R.C., H.V., and M.L.B. wrote the manuscript. C.F.R. and M.L.B. designed the experiments, performed the modeling, and analyzed the data.
Acknowledgments
This work was supported by a grant from the National Institutes of Health, United States (P01GM063210).
Declaration of interests
The authors have no competing interests to declare.
Editor: Meyer Jackson.
Footnotes
Corey F. Hryc’s present address is Fondren Orthopedic Research Institute, Houston, Texas.
References
- 1. Carugo O., Djinović-Carugo K. Structural biology: A golden era. PLoS Biol. 2023;21 doi: 10.1371/journal.pbio.3002187.
- 2. Curry S. Structural Biology: A Century-long Journey into an Unseen World. Interdiscipl. Sci. Rev. 2015;40:308–328. doi: 10.1179/0308018815Z.000000000120.
- 3. Kühlbrandt W. The resolution revolution. Science. 2014;343:1443–1444. doi: 10.1126/science.1251652.
- 4. Bai X.c., McMullan G., Scheres S.H.W. How cryo-EM is revolutionizing structural biology. Trends Biochem. Sci. 2015;40:49–57. doi: 10.1016/j.tibs.2014.10.005.
- 5. Tan Y.Z., Carragher B. Seeing Atoms: Single-Particle Cryo-EM Breaks the Atomic Barrier. Mol. Cell. 2020;80:938–939. doi: 10.1016/j.molcel.2020.11.043.
- 6. Nakane T., Kotecha A., et al. Scheres S.H.W. Single-particle cryo-EM at atomic resolution. Nature. 2020;587:152–156. doi: 10.1038/s41586-020-2829-0.
- 7. Yip K.M., Fischer N., et al. Stark H. Atomic-resolution protein structure determination by cryo-EM. Nature. 2020;587:157–161. doi: 10.1038/s41586-020-2833-4.
- 8. Strack R. Cryo-EM goes atomic. Nat. Methods. 2020;17:1175. doi: 10.1038/s41592-020-01014-1.
- 9. Yao H., Song Y., et al. Li S. Molecular Architecture of the SARS-CoV-2 Virus. Cell. 2020;183:730–738.e13. doi: 10.1016/j.cell.2020.09.018.
- 10. Ke Z., Oton J., et al. Briggs J.A.G. Structures and distributions of SARS-CoV-2 spike proteins on intact virions. Nature. 2020;588:498–502. doi: 10.1038/s41586-020-2665-2.
- 11. Shi W., Cai Y., et al. Chen B. Cryo-EM structure of SARS-CoV-2 postfusion spike in membrane. Nature. 2023;619:403–409. doi: 10.1038/s41586-023-06273-4.
- 12. Jobe A., Liu Z., et al. Frank J. New Insights into Ribosome Structure and Function. Cold Spring Harbor Perspect. Biol. 2019;11 doi: 10.1101/cshperspect.a032615.
- 13. Wang Y.-H., Dai H., et al. Zhou J. Cryo-electron microscopy structure and translocation mechanism of the crenarchaeal ribosome. Nucleic Acids Res. 2023;51:8909–8924. doi: 10.1093/nar/gkad661.
- 14. Nasef M., Parker L., et al. Dokland T. Structure of the Streptococcus pneumoniae 70S Ribosome at 2.9 Å Resolution using Cryo-EM. Microsc. Microanal. 2023;29:938–940.
- 15. Matthies D., Bae C., et al. Swartz K.J. Single-particle cryo-EM structure of a voltage-activated potassium channel in lipid nanodiscs. Elife. 2018;7 doi: 10.7554/eLife.37558.
- 16. Fan G., Baker M.R., et al. Serysheva I.I. Conformational motions and ligand-binding underlying gating and regulation in IP3R channel. Nat. Commun. 2022;13 doi: 10.1038/s41467-022-34574-1.
- 17. Autzen H.E., Myasnikov A.G., et al. Cheng Y. Structure of the human TRPM4 ion channel in a lipid nanodisc. Science. 2018;359:228–232. doi: 10.1126/science.aar4510.
- 18. Nojima S., Fujita Y., et al. Kobayashi T. Cryo-EM Structure of the Prostaglandin E Receptor EP4 Coupled to G Protein. Structure. 2021;29:252–260.e6. doi: 10.1016/j.str.2020.11.007.
- 19. Kobayashi K., Shihoya W., et al. Nureki O. Cryo-EM structure of the human PAC1 receptor coupled to an engineered heterotrimeric G protein. Nat. Struct. Mol. Biol. 2020;27:274–280. doi: 10.1038/s41594-020-0386-8.
- 20. Xia A., Yong X., et al. Yang S. Cryo-EM structures of human GPR34 enable the identification of selective antagonists. Proc. Natl. Acad. Sci. USA. 2023;120 doi: 10.1073/pnas.2308435120.
- 21. Van Drie J.H., Tong L. Cryo-EM as a powerful tool for drug discovery. Bioorg. Med. Chem. Lett. 2020;30 doi: 10.1016/j.bmcl.2020.127524.
- 22. Shepherd D.C., Dalvi S., Ghosal D. From cells to atoms: Cryo-EM as an essential tool to investigate pathogen biology, host-pathogen interaction, and drug discovery. Mol. Microbiol. 2022;117:610–617. doi: 10.1111/mmi.14820.
- 23. Zhang X., Settembre E., et al. Grigorieff N. Near-atomic resolution using electron cryomicroscopy and single-particle reconstruction. Proc. Natl. Acad. Sci. USA. 2008;105:1867–1872. doi: 10.1073/pnas.0711623105.
- 24. Yu X., Jin L., Zhou Z.H. 3.88 Å structure of cytoplasmic polyhedrosis virus by cryo-electron microscopy. Nature. 2008;453:415–419. doi: 10.1038/nature06893.
- 25. Jiang W., Baker M.L., et al. Chiu W. Backbone structure of the infectious ε15 virus capsid revealed by electron cryomicroscopy. Nature. 2008;451:1130–1134. doi: 10.1038/nature06665.
- 26. Ludtke S.J., Baker M.L., et al. Chiu W. De Novo Backbone Trace of GroEL from Single Particle Electron Cryomicroscopy. Structure. 2008;16:441–448. doi: 10.1016/j.str.2008.02.007.
- 27. Jiang W., Baker M.L., et al. Chiu W. Bridging the information gap: computational tools for intermediate resolution structure interpretation. J. Mol. Biol. 2001;308:1033–1044. doi: 10.1006/jmbi.2001.4633.
- 28. Baker M.L., Zhang J., et al. Chiu W. Cryo-EM of macromolecular assemblies at near-atomic resolution. Nat. Protoc. 2010;5:1697–1708. doi: 10.1038/nprot.2010.126.
- 29. Baker M.L., Ju T., Chiu W. Identification of secondary structure elements in intermediate-resolution density maps. Structure. 2007;15:7–19. doi: 10.1016/j.str.2006.11.008.
- 30. Baker M.L., Baker M.R., et al. Chiu W. Gorgon and pathwalking: Macromolecular modeling tools for subnanometer resolution density maps. Biopolymers. 2012;97:655–668. doi: 10.1002/bip.22065.
- 31. Abeysinghe S.S., Ju T., et al. Baker M. Proceedings of the 2007 ACM Symposium on Solid and Physical Modeling. 2007. Shape modeling and matching in identifying protein structure from low-resolution images; pp. 223–232.
- 32. Abeysinghe S., Ju T., et al. Chiu W. Shape modeling and matching in identifying 3D protein structures. Comput. Aided Des. 2008;40:708–720.
- 33. Baker M.L., Abeysinghe S.S., et al. Ju T. Modeling protein structure at near atomic resolutions with Gorgon. J. Struct. Biol. 2011;174:360–373. doi: 10.1016/j.jsb.2011.01.015.
- 34. Baker M.R., Rees I., et al. Baker M.L. Constructing and validating initial Cα models from subnanometer resolution density maps with pathwalking. Structure. 2012;20:450–463. doi: 10.1016/j.str.2012.01.008.
- 35. Ludtke S.J., Lawson C.L., et al. Chiu W. The 2010 cryo-em modeling challenge. Biopolymers. 2012;97:651–654. doi: 10.1002/bip.22081.
- 36. Lawson C.L., Kryshtafovych A., et al. Chiu W. Cryo-EM model validation recommendations based on outcomes of the 2019 EMDataResource challenge. Nat. Methods. 2021;18:156–164. doi: 10.1038/s41592-020-01051-w.
- 37. Lawson C.L., Chiu W. Comparing cryo-EM structures. J. Struct. Biol. 2018;204:523–526. doi: 10.1016/j.jsb.2018.10.004.
- 38. Jumper J., Hassabis D. The Protein Structure Prediction Revolution and Its Implications for Medicine: 2023 Albert Lasker Basic Medical Research Award. JAMA. 2023;330:1425–1426. doi: 10.1001/jama.2023.17095.
- 39. Lupas A.N., Pereira J., et al. Hartmann M.D. The breakthrough in protein structure prediction. Biochem. J. 2021;478:1885–1890. doi: 10.1042/BCJ20200963.
- 40. Jumper J., Evans R., et al. Hassabis D. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2.
- 41. RoseTTAFold: Accurate Protein Structure Prediction Accessible to All - Institute for Protein Design.
- 42. Kryshtafovych A., Schwede T., et al. Moult J. Critical assessment of methods of protein structure prediction (CASP)-Round XIV. Proteins. 2021;89:1607–1617. doi: 10.1002/prot.26237.
- 43. Jumper J., Evans R., et al. Hassabis D. Applying and improving AlphaFold at CASP14. Proteins. 2021;89:1711–1721. doi: 10.1002/prot.26257.
- 44. Kwon S., Won J., et al. Seok C. Assessment of protein model structure accuracy estimation in CASP14: Old and new challenges. Proteins. 2021;89:1940–1948. doi: 10.1002/prot.26192.
- 45. Pereira J., Simpkin A.J., et al. Lupas A.N. High-accuracy protein structure prediction in CASP14. Proteins. 2021;89:1687–1699. doi: 10.1002/prot.26171.
- 46. Terwilliger T.C., Afonine P.V., et al. Adams P.D. Accelerating crystal structure determination with iterative AlphaFold prediction. Acta Crystallogr. D Struct. Biol. 2023;79:234–244. doi: 10.1107/S205979832300102X.
- 47. Barbarin-Bocahu I., Graille M. The X-ray crystallography phase problem solved thanks to AlphaFold and RoseTTAFold models: A case-study report. Acta Crystallogr. D Struct. Biol. 2022;78:517–531. doi: 10.1107/S2059798322002157.
- 48. Oeffner R.D., Croll T.I., et al. Terwilliger T.C. Putting AlphaFold models to work with phenix.process_predicted_model and ISOLDE. Acta Crystallogr. D Struct. Biol. 2022;78:1303–1314. doi: 10.1107/S2059798322010026.
- 49. Simpkin A.J., Thomas J.M.H., et al. Rigden D.J. MrParse: finding homologues in the PDB and the EBI AlphaFold database for molecular replacement and more. Acta Crystallogr. D Struct. Biol. 2022;78:553–559. doi: 10.1107/S2059798322003576.
- 50. Liu H. AlphaFold and Structural Mass Spectrometry Enable Interrogations on the Intrinsically Disordered Regions in Cyanobacterial Light-harvesting Complex Phycobilisome. J. Mol. Biol. 2022;434 doi: 10.1016/j.jmb.2022.167831.
- 51. Li E.H., Spaman L.E., et al. Montelione G.T. Blind assessment of monomeric AlphaFold2 protein structure models with experimental NMR data. J. Magn. Reson. 2023;352 doi: 10.1016/j.jmr.2023.107481.
- 52. He J., Lin P., et al. Huang S.Y. Model building of protein complexes from intermediate-resolution cryo-EM maps with deep learning-guided automatic assembly. Nat. Commun. 2022;13 doi: 10.1038/s41467-022-31748-9.
- 53. Hryc C.F., Baker M.L. AlphaFold2 and CryoEM: Revisiting CryoEM modeling in near-atomic resolution density maps. iScience. 2022;25 doi: 10.1016/j.isci.2022.104496.
- 54. Yang Z., Zeng X., Chen R., et al. AlphaFold2 and its applications in the fields of biology and medicine. Signal Transduct. Targeted Ther. 2023;8:115. doi: 10.1038/s41392-023-01381-z.
- 55.Gao W., Mahajan S.P., et al. Gray J.J. Deep learning in protein structural modeling and design. Patterns. 2020;1 doi: 10.1016/j.patter.2020.100142. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.AlQuraishi M. Machine learning in protein structure prediction. Curr. Opin. Chem. Biol. 2021;65:1–8. doi: 10.1016/j.cbpa.2021.04.005. [DOI] [PubMed] [Google Scholar]
- 57.Mirdita M., Schütze K., et al. Steinegger M. ColabFold: making protein folding accessible to all. Nat. Methods. 2022;19:679–682. doi: 10.1038/s41592-022-01488-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 58.Varadi M., Anyango S., et al. Velankar S. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 2022;50:D439–D444. doi: 10.1093/nar/gkab1061. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.van Kempen M., Kim S.S., et al. Steinegger M. Fast and accurate protein structure search with Foldseek. Nat. Biotechnol. 2023;2023:1–4. doi: 10.1038/s41587-023-01773-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Mariani V., Biasini M., et al. Schwede T. IDDT: A local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics. 2013;29:2722–2728. doi: 10.1093/bioinformatics/btt473. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Zhang Y., Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins. 2004;57:702–710. doi: 10.1002/prot.20264. [DOI] [PubMed] [Google Scholar]
- 62.Zhang Y., Skolnick J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 2005;33:2302–2309. doi: 10.1093/nar/gki524. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 63.Zhu W., Shenoy A., et al. Elofsson A. Evaluation of AlphaFold-Multimer prediction on multi-chain protein complexes. Bioinformatics. 2023;39 doi: 10.1093/bioinformatics/btad424. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Evans R., O’Neill M., et al. Hassabis D. Protein complex prediction with AlphaFold-Multimer. bioRxiv. 2022 doi: 10.1101/2021.10.04.463034. Preprint at. [DOI] [Google Scholar]
- 65.Bryant P., Pozzati G., et al. Elofsson A. Improved prediction of protein-protein interactions using AlphaFold2. Nat. Commun. 2022;13 doi: 10.1038/s41467-022-28865-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 66.Hekkelman M.L., de Vries I., et al. Perrakis A. AlphaFill: enriching AlphaFold models with ligands and cofactors. Nat. Methods. 2023;20:205–213. doi: 10.1038/s41592-022-01685-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Ahdritz G., Bouatta N., et al. Bio C. OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization. bioRxiv. 2023 doi: 10.1101/2022.11.20.517210. Preprint at. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Ren F., Ding X., et al. Zhavoronkov A. AlphaFold accelerates artificial intelligence powered drug discovery: efficient discovery of a novel CDK20 small molecule inhibitor. Chem. Sci. 2023;14:1443–1452. doi: 10.1039/d2sc05709c. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Jendrusch M., Korbel Id J.O., et al. Id S. AlphaDesign: A de novo protein design framework based on AlphaFold. bioRxiv. 2021 doi: 10.1101/2021.10.11.463937. Preprint at. [DOI] [Google Scholar]
- 70.Topf M., Baker M.L., et al. Sali A. Structural characterization of components of protein assemblies by comparative modeling and electron cryo-microscopy. J. Struct. Biol. 2005;149:191–203. doi: 10.1016/j.jsb.2004.11.004. [DOI] [PubMed] [Google Scholar]
- 71.Topf M., Baker M.L., et al. Sali A. Refinement of protein structures by iterative comparative modeling and CryoEM density fitting. J. Mol. Biol. 2006;357:1655–1668. doi: 10.1016/j.jmb.2006.01.062. [DOI] [PubMed] [Google Scholar]
- 72.Baker M.L., Jiang W., et al. Chiu W. Ab initio modeling of the herpesvirus VP26 core domain assessed by CryoEM density. PLoS Comput. Biol. 2006;2:e146. doi: 10.1371/journal.pcbi.0020146. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 73.Terashi G., Wang X., Kihara D. Protein model refinement for cryo-EM maps using AlphaFold2 and the DAQ score. Acta Crystallogr. D Struct. Biol. 2023;79:10–21. doi: 10.1107/S2059798322011676. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 74.Sülzen H., Began J., et al. Zoll S. Cryo-EM structures of Trypanosoma brucei gambiense ISG65 with human complement C3 and C3b and their roles in alternative pathway restriction. Nat. Commun. 2023;14:2403. doi: 10.1038/s41467-023-37988-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 75.Casañal A., Lohkamp B., Emsley P. Current developments in Coot for macromolecular model building of Electron Cryo-microscopy and Crystallographic Data. Protein Sci. 2020;29:1069–1078. doi: 10.1002/pro.3791. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 76.Adams P.D., Afonine P.V., et al. Zwart P.H. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D. 2010;66:213–221. doi: 10.1107/S0907444909052925. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Vagin A.A., Steiner R.A., et al. Murshudov G.N. REFMAC5 dictionary: Organization of prior chemical knowledge and guidelines for its use. Acta Crystallogr. D Biol. Crystallogr. 2004;60:2184–2195. doi: 10.1107/S0907444904023510. [DOI] [PubMed] [Google Scholar]
- 78.Pasqualetto G., Mack A., et al. Young M.T. CryoEM structure and Alphafold molecular modelling of a novel molluscan hemocyanin. PLoS One. 2023;18 doi: 10.1371/journal.pone.0287294. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Fontana P., Dong Y., et al. Wu H. Structure of cytoplasmic ring of nuclear pore complex by integrative cryo-EM and AlphaFold. Science. 2022;376:376. doi: 10.1126/science.abm9326. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Gupta M., Azumaya C.M., et al. Verba A.M.K.A. CryoEM and AI reveal a structure of SARS-CoV-2 Nsp2, a multifunctional protein involved in key host processes. bioRxiv. 2021 doi: 10.1101/2021.05.10.443524. Preprint at. [DOI] [Google Scholar]
- 81.Pfab J., Phan N.M., Si D. DeepTracer for fast de novo cryo-EM protein structure modeling and special studies on cov-related complexes. Proc. Natl. Acad. Sci. USA. 2021;118 doi: 10.1073/pnas.2017525118. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 82.Croll T.I. ISOLDE: a physically realistic environment for model building into low-resolution electron-density maps. Acta Crystallogr. D. 2018;74:519–530. doi: 10.1107/S2059798318002425. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 83.Leman J.K., Weitzner B.D., et al. Bonneau R. Macromolecular modeling and design in Rosetta: recent methods and frameworks. Nat. Methods. 2020;17:665–680. doi: 10.1038/s41592-020-0848-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 84.DiMaio F., Tyka M.D., et al. Baker D. Refinement of Protein Structures into Low-Resolution Density Maps Using Rosetta. J. Mol. Biol. 2009;392:181–190. doi: 10.1016/j.jmb.2009.07.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 85.Noone D.P., Dijkstra D.J., et al. Sharp T.H. PTX3 structure determination using a hybrid cryoelectron microscopy and AlphaFold approach offers insights into ligand binding and complement activation. Proc. Natl. Acad. Sci. USA. 2022;119 doi: 10.1073/pnas.2208144119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Kost T.A., Kemp C.W. Fundamentals of Baculovirus Expression and Applications. Adv. Exp. Med. Biol. 2016;896:187–197. doi: 10.1007/978-3-319-27216-0_12. [DOI] [PubMed] [Google Scholar]
- 87.Ayres M.D., Howard S.C., et al. Possee R.D. The Complete DNA Sequence of Autographa californica Nuclear Polyhedrosis Virus. Virology. 1994;202:586–605. doi: 10.1006/viro.1994.1380. [DOI] [PubMed] [Google Scholar]
- 88.Van Oers M.M., Pijlman G.P., Vlak J.M. Thirty years of baculovirus-insect cell protein expression: from dark horse to mainstream technology. J. Gen. Virol. 2015;96:6–23. doi: 10.1099/vir.0.067108-0. [DOI] [PubMed] [Google Scholar]
- 89.Jia X., Gao Y., et al. Zhang Q. Architecture of the baculovirus nucleocapsid revealed by cryo-EM. Nat. Commun. 2023;14 doi: 10.1038/s41467-023-43284-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Rohrmann G.F. Baculovirus Molecular Biology; 2019. Baculovirus Molecular Biology. [Google Scholar]
- 91.Webb S., Mukhopadhyay A.G., Roberts A.J. Intraflagellar transport trains and motors: Insights from structure. Semin. Cell Dev. Biol. 2020;107:82–90. doi: 10.1016/j.semcdb.2020.05.021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Hesketh S.J., Mukhopadhyay A.G., et al. Roberts A.J. IFT-A structure reveals carriages for membrane protein transport into cilia. Cell. 2022;185:4971–4985.e16. doi: 10.1016/j.cell.2022.11.010. [DOI] [PubMed] [Google Scholar]
- 93.Trabuco L.G., Villa E., et al. Schulten K. Molecular Dynamics Flexible Fitting: A practical guide to combine cryo-electron microscopy and X-ray crystallography. Methods. 2009;49:174–180. doi: 10.1016/j.ymeth.2009.04.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Dill K.A., Ozkan S.B., et al. Weikl T.R. The Protein Folding Problem. Annu. Rev. Biophys. 2008;37:289–316. doi: 10.1146/annurev.biophys.37.092707.153558. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Trivedi R., Nagarajaram H.A. Intrinsically Disordered Proteins: An Overview. Int. J. Mol. Sci. 2022;23 doi: 10.3390/ijms232214050. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 96.Babu M.M., Kriwacki R.W., Pappu R.V. Versatility from protein disorder. Science. 2012;337:1460–1461. doi: 10.1126/science.1228775. [DOI] [PubMed] [Google Scholar]
- 97.Prusiner S.B. Molecular Biology of Prion Diseases. Science. 1991;252:1515–1522. doi: 10.1126/science.1675487. [DOI] [PubMed] [Google Scholar]
- 98.Colby D.W., Prusiner S.B. Prions. Cold Spring Harbor Perspect. Biol. 2011;3:a006833. doi: 10.1101/cshperspect.a006833. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Riek R., Hornemann S., et al. Wüthrich K. NMR structure of the mouse prion protein domain PrP(121–231) Nature. 1996;382:180–182. doi: 10.1038/382180a0. [DOI] [PubMed] [Google Scholar]
- 100.Vázquez-Fernández E., Vos M.R., et al. Wille H. The Structural Architecture of an Infectious Mammalian Prion Using Electron Cryomicroscopy. PLoS Pathog. 2016;12 doi: 10.1371/journal.ppat.1005835. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 101.Pettersen E.F., Goddard T.D., et al. Ferrin T.E. UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Sci. 2021;30:70–82. doi: 10.1002/pro.3943. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 102.Zhang Q., Gao Y., et al. Jiang W. The structure of a 12-segmented dsRNA reovirus: New insights into capsid stabilization and organization. PLoS Pathog. 2023;19 doi: 10.1371/journal.ppat.1011341. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 103.Akdel M., Pires D.E.V., et al. Beltrao P. A structural biology community assessment of AlphaFold2 applications. Nat. Struct. Mol. Biol. 2022;29:1056–1067. doi: 10.1038/s41594-022-00849-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Binder J.L., Berendzen J., et al. Oprea T.I. AlphaFold illuminates half of the dark human proteins. Curr. Opin. Struct. Biol. 2022;74 doi: 10.1016/j.sbi.2022.102372. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Ruff K.M., Pappu R.V. AlphaFold and Implications for Intrinsically Disordered Proteins. J. Mol. Biol. 2021;433 doi: 10.1016/j.jmb.2021.167208. [DOI] [PubMed] [Google Scholar]
- 106.Wang Z., Hryc C.F., et al. Chiu W. An atomic model of brome mosaic virus using direct electron detection and real-space optimization. Nat. Commun. 2014;5 doi: 10.1038/ncomms5808. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Afonine P.V., Poon B.K., et al. Adams P.D. Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr. D Struct. Biol. 2018;74:531–544. doi: 10.1107/S2059798318006551. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Punjani A., Fleet D.J. 3D variability analysis: resolving continuous flexibility and discrete heterogeneity from single particle cryo-EM. J. Struct. Biol. 2021;213 doi: 10.1016/j.jsb.2021.107702. [DOI] [PubMed] [Google Scholar]
- 109.Chen M., Ludtke S.J. Deep learning-based mixed-dimensional Gaussian mixture model for characterizing variability in cryo-EM. Nat. Methods. 2021;18:930–936. doi: 10.1038/s41592-021-01220-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Del Alamo D., Sala D., et al. Meiler J. Sampling alternative conformational states of transporters and receptors with AlphaFold2. Elife. 2022;11 doi: 10.7554/eLife.75751. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.Baek M., Baker D. Deep learning and protein structure modeling. Nat. Methods. 2022;19:13–14. doi: 10.1038/s41592-021-01360-8. [DOI] [PubMed] [Google Scholar]
- 112.Punjani A., Fleet D.J. 3DFlex: determining structure and motion of flexible proteins from cryo-EM. Nat. Methods. 2023;20:860–870. doi: 10.1038/s41592-023-01853-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Xiao H., Zhou J., et al. Cheng L. Assembly and Capsid Expansion Mechanism of Bacteriophage P22 Revealed by High-Resolution Cryo-EM Structures. Viruses. 2023;15:355. doi: 10.3390/v15020355. [DOI] [PMC free article] [PubMed] [Google Scholar]





