Author manuscript; available in PMC: 2021 Feb 18.
Published in final edited form as: J Chem Inf Model. 2020 May 18;60(5):2436–2442. doi: 10.1021/acs.jcim.0c00090

Practical Considerations for Atomistic Structure Modeling with Cryo-EM Maps

Doo Nam Kim §, Dominik Gront, Karissa Y Sanbonmatsu ‡,▵,*
PMCID: PMC7891309  NIHMSID: NIHMS1661714  PMID: 32422044

Abstract

We describe common approaches to atomistic structure modeling with single particle analysis derived cryo-EM maps. Several strategies for atomistic model building and atomistic model fitting methods are discussed, including selection criteria and implementation procedures. In covering basic concepts and caveats, this short perspective aims to help facilitate active discussion between scientists at different levels with diverse backgrounds.

Introduction

Single particle analysis is a popular cryo-electron microscopy (cryo-EM) method that often achieves higher resolution than tomography1. The ultimate goal of single particle analysis is an atomistic structure model that enables insight into mechanism, molecular dynamics, structure comparison and structure-based drug design. In many cases, constructing the atomistic model from the 3-D cryo-EM reconstruction map represents a significant bottleneck2. Therefore, easier access to convenient modeling tools is of interest. Recently, Malhotra et al. provided an excellent review of atomistic structure modeling that introduces comprehensive lists of programs with respect to different cryo-EM map resolutions and validation3. Here, we focus on several programs, organized by the availability of homologous sequences and structures and by the initial correlation between map and model; these considerations can be applied to other programs as well. These programs often aim to automate most modeling processes to reduce modeling time and error. In fact, as cryo-EM biological sample preparation and grid optimization become more automated1,4, computational modeling procedures are also becoming more automated4,5 relative to manual modeling methods6,7. For example, although Segger segments cryo-EM maps efficiently, overcoming the limitations of labor-intensive work8, phenix.segment_and_split_map aims to integrate segmentation and symmetry analysis with automated model-building9. Throughout this perspective article, we highlight the roles of molecular dynamics (MD) in cryo-EM based atomistic structure modeling. However, the role of MD simulation for other purposes, such as simulating protein-ligand association events and dissociation rates10 and deciphering off-target effects11 in CRISPR-Cas9, should be recognized as well.

Atomistic model building

Once a 3-D cryo-EM reconstruction map is obtained, one needs to either find or build the atomistic model that will fit into the map. There are several methods of building atomistic structural models. The choice of building method depends on the availability of sequence and structure (Figure 1). Availability of sequence can be ascertained from a BLAST (Basic Local Alignment Search Tool) search12. When the target is a protein, PSI-BLAST (Position-Specific Iterative BLAST), rather than BLASTp (basic search of a protein sequence against a protein database), finds distant relatives of the query protein13. PSI-BLAST has improved sensitivity over sequence–sequence comparison methods, as a sequence profile (built from a multiple alignment of homologous sequences) contains more information about the sequence family than a single sequence14. Although PSI-BLAST takes longer to run since it searches iteratively, it retrieves more relevant and useful sequences for homology modeling. Practically speaking, PSI-BLAST with the NCBI nr (non-redundant protein sequences) database is quite memory intensive, especially in light of the rapid growth of sequence databases in recent years15. The UniRef50 database is more manageable while maintaining high search speed and quality. It is clustered further from UniRef100, in which identical sequences and sub-fragments with 11 or more residues are placed into a single record16.
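
A profile search of this kind could be launched with the `psiblast` program from the BLAST+ suite. The sketch below only assembles the command line and does not run it. The query file name, the locally formatted `uniref50` database name, the output file name, the iteration count and the E-value cutoff are all assumptions chosen for illustration, not values recommended in this article.

```python
import shlex


def build_psiblast_command(query_fasta, database="uniref50",
                           iterations=3, evalue=0.001, threads=4):
    """Assemble a BLAST+ psiblast command line as a list of arguments.

    Every default here is an illustrative assumption; adjust to the local setup.
    """
    return [
        "psiblast",
        "-query", query_fasta,               # target sequence in FASTA format
        "-db", database,                     # locally formatted protein database
        "-num_iterations", str(iterations),  # profile is rebuilt each iteration
        "-evalue", str(evalue),              # reporting threshold
        "-num_threads", str(threads),
        "-outfmt", "6",                      # tabular output
        "-out", "psiblast_hits.tsv",         # hypothetical output file name
    ]


command = build_psiblast_command("target.fasta")
print(" ".join(shlex.quote(arg) for arg in command))
```

The resulting list can be passed to `subprocess.run` once BLAST+ and a formatted database are available locally.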

Figure 1.

Atomistic structure modeling method flow chart.

Availability of a sequence whose 3-D structure is already determined can be identified by a BLAST search against the PDB (set the database to ‘protein data bank’ rather than ‘nucleotide collection’ or ‘non-redundant protein sequences’). Here, if the BLAST search returns a match nearly identical to our target sequence, we can use the previously determined 3-D structure. In this case, the two 3-D structures may originate from different organisms or conditions (often the presence or absence of a ligand) but are usually quite similar. Therefore, it is relatively straightforward to fit the structure into a similar map conformation either by rigid-body fitting or flexible fitting1, apart from a few exceptional cases (e.g., domain swapping)17.

Homology modeling.

In the case that an atomistic structure for the target sequence does not exist, a novel structure building procedure is needed. Among structure prediction methods, homology modeling (comparative modeling, template-based modeling) is the most accurate18. This procedure is based on the assumption that two different but homologous proteins still share a similar 3-D shape, provided that they have not diverged substantially during evolution. A structure for a target protein can therefore be calculated based on an experimentally established structure of another protein, called a template. The selection of a template (or templates) for modeling and its alignment with the target sequence are the most important factors influencing the final results. If the sequence identity is > 30%18, or the protein:protein alignment has an E-value (the number of alignments with an equal or better score expected by chance) < 0.001, or the DNA:DNA alignment has an E-value < 10⁻¹⁰, homology modeling can be considered. Certainly, the threshold that characterizes whether the templates have adequate sequence similarity varies among researchers19. For example, some scientists use retrieved sequences as long as the E-value is < 0.01, whereas others19 require an E-value < 0.00001. To make matters more confusing (but more promising when available sequences are lacking), homologous sequences do not always share significant sequence similarity20. Even when the overall similarity is low, some researchers use parts of multiple templates, as long as their local similarities are sufficiently high. All these prerequisites are summarized at the new Robetta webserver21, which states that the accuracy of homology modeling mainly depends on whether similar sequences (homologs) exist in available sequence databases and in the PDB. This availability of similar sequences and structures correlates with the actual global distance test (GDT) score relative to the native structure and is conveniently reported in the Robetta result files21.

In order to extend the applicability of homology modeling, a broad variety of methods have been proposed to detect more distant homologs and to improve the quality of alignments22. Classical alignment of two protein sequences can be a good solution when sequence identity is relatively high (say, above 50%). In more difficult cases, one can align sequence profiles23, Hidden Markov Models14,24 or deep neural-network based contact maps18. Finally, one can align a target sequence profile with a template 3-D structure using a 3-D threading approach25. Several of these methods have also been published as web services, such as the very convenient-to-use Bioinformatics Toolkit26.
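
The template criteria quoted above (sequence identity above 30%, or a protein:protein E-value below 0.001) can be expressed as a small check. This is a minimal sketch under stated assumptions: percent identity is computed over aligned, non-gap columns only (other definitions exist), the function names and the two toy aligned sequences are hypothetical, and the thresholds are defaults that a researcher may well set differently.

```python
def percent_identity(aligned_target, aligned_template):
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def template_is_usable(identity_percent, evalue,
                       min_identity=30.0, max_evalue=0.001):
    """True if either protein criterion quoted in the text is met."""
    return identity_percent > min_identity or evalue < max_evalue


# Toy pairwise alignment, invented for illustration.
identity = percent_identity("MKT-AYIAKQR", "MKTLAYLAKQ-")
print(round(identity, 1), template_is_usable(identity, evalue=1e-5))
```

Because the two criteria are joined by "or", a template with low global identity but a significant E-value still passes, which matches the observation that homologs do not always share high sequence similarity.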

There are many methods to build a homology model based on a selected template. Here, we focus on Rosetta Comparative Modeling (RosettaCM), which modeled structures more accurately than other methods when the sequence identity was greater than 15%27 and has continued to be the most consistently top-performing method28 since it joined CAMEO (Continuous Automated Model EvaluatiOn)29. Like other Rosetta applications (such as Ab initio folding and some loop modeling methods), RosettaCM requires fragments (see below). Once a user has sequence information, all procedures of RosettaCM can be run automatically from the new Robetta webserver21. If one wants to run locally, either for a transparent understanding of the procedures or for high-performance modeling of thousands of molecules, all procedures of RosettaCM (including clustering of decoys30 and energy-based selection after FastRelax31) can be run automatically as long as a user provides sequence information32, with help from Biopython33.

Ab initio modeling.

If we cannot find sequences homologous to the target sequence, Ab initio modeling can be considered. Among many Ab initio modeling methods, here we describe the Rosetta Ab initio procedure, which itself ranked high in recent CASP (Critical Assessment of Protein Structure Prediction) competitions and parts of whose score functions and sampling procedures have been used by many scientists, including in recent protein structure prediction using multiple deep neural networks34 and prediction of differences in free energies35. All steps of the Rosetta Ab initio procedure can be run automatically from the new Robetta webserver21. Running Rosetta Ab initio locally, on either a high performance computing cluster or a Linux workstation, is also fairly straightforward as long as a user manually selects the final model36. The caveat of Rosetta Ab initio protein structure prediction is that it is generally accurate only for proteins up to about 100 amino acids long, and more accurate for low contact order proteins (e.g., α-helical proteins) than for high contact order ones (e.g., β-sheet proteins). To overcome this limitation, co-evolution based distance constraints are needed37. Ab initio approaches are far more computationally demanding than homology modeling38. By exploiting information from a template, one can calculate a few plausible structural models and accomplish modeling within one hour. In the case of Ab initio modeling, however, one has to create at least 10⁴ to 10⁵ structural models, spending from 10 minutes to 1 hour of Rosetta computation on each of them. Therefore, while homology modeling can easily be done on a personal laptop, in-house Ab initio modeling can require access to considerable computational resources.
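
To make the cost comparison concrete, the bounds quoted above (10⁴ to 10⁵ models at 10 minutes to 1 hour each) correspond to roughly 1,700 to 100,000 single-core CPU-hours. The short calculation below reproduces this arithmetic; the 100-core cluster size is a hypothetical figure used only to translate CPU-hours into wall-clock time.

```python
def cpu_hours(n_models, minutes_per_model):
    """Total single-core compute time in hours."""
    return n_models * minutes_per_model / 60.0


# Bounds quoted in the text: 10^4 to 10^5 models, 10 minutes to 1 hour each.
low = cpu_hours(10**4, 10)
high = cpu_hours(10**5, 60)

# Hypothetical cluster size, chosen only to illustrate wall-clock time.
cores = 100
print(f"CPU-hours: {low:,.0f} to {high:,.0f}")
print(f"Wall-clock days on {cores} cores: "
      f"{low / cores / 24:.1f} to {high / cores / 24:.1f}")
```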

If one wants to incorporate cryo-EM map based structural information beginning at the atomistic model building stage, RosettaES (enumerative sampling)39, phenix.map_to_model40, and pathwalker41 can be used. By using additional structural guidance from the cryo-EM map, these de novo modeling methods are able to model challenging regions that have been impossible to model with other methods. Because they rely on features visible in the density, however, a relatively “high” (better) map resolution (< 4 Å) is required.
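
The choice among the building routes described in this section can be summarized as a simple decision function. This is a sketch of one reading of the text, not a reproduction of Figure 1: the function name, argument names and the order in which the options are tested are assumptions, and the 4 Å cutoff is the resolution quoted above for map-guided de novo methods.

```python
def choose_building_method(has_near_identical_structure,
                           has_homologous_template,
                           map_resolution_angstrom=None):
    """Suggest a model building route from sequence/structure availability."""
    if has_near_identical_structure:
        # A previously determined structure can be fitted directly.
        return "use existing structure"
    if has_homologous_template:
        return "homology modeling"
    if map_resolution_angstrom is not None and map_resolution_angstrom < 4.0:
        # Map-guided tools named in the text need a better-than-4 Å map.
        return "map-guided de novo modeling"
    return "ab initio modeling"


print(choose_building_method(False, True))
print(choose_building_method(False, False, map_resolution_angstrom=3.2))
print(choose_building_method(False, False, map_resolution_angstrom=8.0))
```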

Common features of homology modeling and Ab initio modeling.

Both RosettaCM and Rosetta Ab initio modeling use structural fragments to build a full 3-D structure of a target protein. Fragments, i.e., short backbone conformations, are extracted from known structures deposited in the PDB. Rosetta utilizes two sets of fragments, long and short ones, to introduce large and small changes to a modeled conformation, respectively. Traditionally, 3-mers and 9-mers have been used42,43,44, but one can also apply longer or shorter fragments44, depending on the goal. Fragments can be computed by the fragment_picker45 application, which is also available online as a part of the Robetta webserver46. The program assesses how well every segment from a large library of structures fits a particular location in the target protein. The fitness (also known as score) function takes into account the secondary structure match, i.e., predicted for the target vs observed for the fragment (SecondarySimilarity score), as well as the similarity between the two sequence profiles (ProfileSimilarity score). A score based on a Ramachandran map (RamaScore) is used to prevent quite common mistakes with beta-branched amino acids. SecondarySimilarity is the most important part of the fragment scoring scheme, while ProfileSimilarity has only a minor effect on the quality of fragments. The fragments are sequence-specific and must be computed independently for every protein sequence subjected to modeling. The resulting set contains 400 fragments (200 short and 200 long ones) for each sequence position.
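
The idea of ranking candidate fragments by a weighted sum of score terms and keeping the best ones per position can be sketched as follows. This is not the fragment_picker implementation: the weights, the convention that lower scores are better, the `Candidate` class and the toy values are all assumptions made for illustration. Only the three term names and the 200-per-position count come from the text.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    source: str                  # hypothetical label for the library segment
    secondary_similarity: float
    profile_similarity: float
    rama_score: float


# Hypothetical weights; SecondarySimilarity is given the largest weight
# because the text describes it as the most important term.
WEIGHTS = {"secondary_similarity": 1.0,
           "profile_similarity": 0.3,
           "rama_score": 0.5}


def total_score(c):
    """Weighted sum of the three terms (assumed convention: lower is better)."""
    return (WEIGHTS["secondary_similarity"] * c.secondary_similarity
            + WEIGHTS["profile_similarity"] * c.profile_similarity
            + WEIGHTS["rama_score"] * c.rama_score)


def pick_fragments(candidates, n_keep=200):
    """Keep the n_keep best-scoring candidates for one sequence position."""
    return sorted(candidates, key=total_score)[:n_keep]


# Toy candidates for a single position, invented for illustration.
pool = [Candidate("A", 0.1, 0.9, 0.2),
        Candidate("B", 0.8, 0.1, 0.1),
        Candidate("C", 0.3, 0.2, 0.0)]
best = pick_fragments(pool, n_keep=2)
print([c.source for c in best])
```

Running the same selection once for short and once for long fragments would give the 400 fragments per position mentioned above.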

For more accurate modeling, additional constraints from various sources (including chemical probing) can be used (Figure 1). For example, refinement using both NMR chemical shifts and cryo-EM density outperforms refinement with cryo-EM density only47. In another example, we used the Situs package48 to conveniently transform a SAXS (small angle X-ray scattering) based solution volume into cryo-EM map formats (mrc, situs types)4,49. Additionally, evolutionarily conserved sequence information can guide pairwise distance constraints50. These sequence-based distance constraints are similar to distance constraints from NMR NOESY. They effectively reduce the conformational space that must be sampled, allowing faster and more accurate simulation. Practically speaking, constraint information can be downloaded from the Gremlin webserver51 and can be incorporated easily into most Rosetta applications27. For co-evolution information to be helpful, the number of sequences in the family must be at least 3 to 100 times the length of the query protein52. The new Robetta webserver21 conveniently reports whether there are enough homologous sequences for the query sequence.
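
The sequence-depth requirement for co-evolution information can be checked with one division. In this sketch the function names are hypothetical, and the default threshold of 3 sequences per residue is simply the low end of the 3 to 100 range quoted above; a stricter cutoff may be appropriate in practice.

```python
def sequences_per_residue(n_family_sequences, query_length):
    """Ratio of homologous sequences in the family to query length."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    return n_family_sequences / query_length


def coevolution_likely_helpful(n_family_sequences, query_length, min_ratio=3.0):
    """True if the family is at least min_ratio times deeper than the query is long."""
    return sequences_per_residue(n_family_sequences, query_length) >= min_ratio


# Example: a 150-residue query with 600 versus 200 homologous sequences.
print(coevolution_likely_helpful(600, 150))
print(coevolution_likely_helpful(200, 150))
```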

Usually, selecting models by a “better” (more negative) Rosetta total energy/score alone yields the intended protein designs successfully53. However, to choose a more native-like model after simulation, one needs to pay attention to other factors as well, including protein function, stability, solubility and random mutations54. Therefore, one needs to select a frequently sampled55 and preferably sampled (via biased forward folding)38,44 decoy. To select a design as tightly packed as a native protein56, relaxing in dualspace (combined internal coordinate and Cartesian relaxation)57 is recommended. In particular, one needs to make sure that the model does not have any unsatisfied polar bonds (which make structures unstable)58, as these are not easily captured by the Rosetta score function alone. Related RosettaScripts59 conveniently filter decoys/candidate models on most of these traits (packing, unsatisfied polar bonds). Upon final visual inspection, Foldit standalone60 conveniently highlights any remaining unsatisfied or nearly unsatisfied polar bonds. To analyze any clashes, PyMOL61 can visualize the van der Waals (VDW) radius of each atom as a sphere. These visual inspections can also help to ensure correct chain connectivity/contact order of the models. For example, although Rosetta is very good at building missing protein loop regions, it sometimes produces a structure whose continuous protein backbone chain is intertwined (not observed in native/foldable proteins). We needed to remove models with such intertwined continuous chains when modeling RNA structure as well62.
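
Combining the total score with the additional criteria discussed above amounts to filtering first and ranking second. The sketch below illustrates that order of operations on a hand-made table; the dictionary keys, thresholds and all numerical values are invented for illustration and are not Rosetta output or a Rosetta API.

```python
# Hypothetical per-decoy summary; in practice these values would be parsed
# from score files and filter reports.
decoys = [
    {"name": "decoy_001", "total_score": -250.3, "unsatisfied_polars": 2, "cluster_size": 40},
    {"name": "decoy_002", "total_score": -245.1, "unsatisfied_polars": 0, "cluster_size": 35},
    {"name": "decoy_003", "total_score": -248.0, "unsatisfied_polars": 0, "cluster_size": 3},
    {"name": "decoy_004", "total_score": -240.0, "unsatisfied_polars": 0, "cluster_size": 50},
]


def select_decoy(candidates, max_unsatisfied=0, min_cluster_size=10):
    """Return the lowest-energy decoy among those passing both filters.

    cluster_size stands in for "frequently sampled"; both thresholds are
    assumptions. Returns None if no decoy passes.
    """
    passing = [d for d in candidates
               if d["unsatisfied_polars"] <= max_unsatisfied
               and d["cluster_size"] >= min_cluster_size]
    if not passing:
        return None
    return min(passing, key=lambda d: d["total_score"])


chosen = select_decoy(decoys)
print(chosen["name"] if chosen else "no decoy passed the filters")
```

In this toy table the decoy with the lowest total score is rejected because of its unsatisfied polar groups, which is the situation the paragraph above warns about.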

Flexible fitting of atomistic models

Once the initial atomistic model is obtained, one needs to fit the model into the cryo-EM map. The choice of fitting method depends on the starting similarity between the initial atomistic model and the cryo-EM map (Figure 2). For example, if the initial atomistic model is structurally quite different from the map (requiring a larger radius of convergence), then rigid body fitting should be applied first. Once the overall/global alignment between the model and map achieved by rigid body fitting is good, flexible fitting can be applied. Next, the local structure can be refined. However, if the initial atomistic model is close to the map, then flexible fitting followed by refinement is often sufficient. Of course, if the backbone of the initial atomistic model is already quite similar to the map, local structure refinement alone suffices. In our experience, rigid body fitting alone often cannot fit local geometrical features. Flexible fitting applied directly, without rigid-body global fitting, often fails because the energy gradient is too steep when the initial atomistic model is structurally quite different from the map4 (note that a newer version of phenix.cryo_fit63 helps to overcome this limitation). The same holds for refinement alone: when refinement alone64 is applied to a model that is significantly different from the map, the structure can be trapped in a local energy minimum and not fit properly4. In our experience, running rigid body fitting, flexible fitting and refinement in succession is an optimal strategy. If the initial correlation between model and map is already high (the RMSD between the initial atomistic model and the map is less than a few Angstroms), refinement alone64 runs faster while producing fits similar to those from flexible fitting followed by refinement.
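
The staged strategy described above can be written as a lookup from starting similarity to an ordered list of steps. This is a sketch of the reasoning in the text rather than a transcription of Figure 2; the three category labels and the function name are hypothetical, and deciding which category a given model falls into is left to the user.

```python
def choose_fitting_steps(initial_similarity):
    """Ordered fitting steps for a given starting model-to-map similarity.

    initial_similarity is one of:
      "far"              - model is structurally quite different from the map
      "close"            - model is close to the map
      "backbone_matches" - backbone is already quite similar to the map
    """
    plans = {
        "far": ["rigid body fitting", "flexible fitting", "refinement"],
        "close": ["flexible fitting", "refinement"],
        "backbone_matches": ["refinement"],
    }
    if initial_similarity not in plans:
        raise ValueError(f"unknown similarity category: {initial_similarity!r}")
    return plans[initial_similarity]


for category in ("far", "close", "backbone_matches"):
    print(category, "->", " -> ".join(choose_fitting_steps(category)))
```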

Figure 2.

Atomistic structure fitting flowchart

Rigid body fitting.

When we performed rigid body fitting, the ‘Fit in map’ module in UCSF Chimera65 and phenix.dock_in_map66 gave similar results (alignments) in almost all cases. The ‘Fit in map’ module in UCSF Chimera was sometimes faster and does not require map resolution information. However, when the initial model and map are far apart, phenix.dock_in_map fits well automatically, bypassing the requirement for manual initial placement. Therefore, when one needs to perform rigid body fitting for many structure-map pairs (hundreds to thousands), phenix.dock_in_map66, which can be executed from the command line, is an excellent choice.

Flexible fitting.

In flexible fitting, a common approach is a cross-correlation coefficient-based method, where the MD potentials are biased by the correlation between the experimentally derived map and a simulated map, yielding atomistic models consistent with the cryo-EM map. In terms of parallelization, atomic decomposition MD is suitable for small proteins, while domain decomposition MD is better for larger systems such as the ribosome67. Automatic running of this flexible fitting is provided by phenix.cryo_fit, which can produce consistent results without the need to learn MD simulation4 (Figures 3 and 4 and Supporting Movie S1). This method has successfully fitted many DNA, RNA, and protein structures, including the ribosome and nucleosome, while preserving geometrical stereochemistry. Phenix.cryo_fit2, which runs entirely within the PHENIX software suite68, is currently under development63. This package works synergistically with other PHENIX applications. For example, it can deal with most chemical entities easily using phenix.eLBOW69 and incorporates user-designated distance constraints that come from other sources such as NMR, SHAPE70, FRET71 and SAXS49 to further improve model correctness while minimizing the sampling requirement. The method maintains secondary structure by using automatically generated secondary structure restraints72 and model idealization73. All these enhanced features allow phenix.cryo_fit2 to outperform phenix.cryo_fit in 7 out of 10 cases (for cryo-EM maps with resolutions of 3 to 24 Å). In Figures 3 and 4, we display early results for fits that require quite large radii of convergence (i.e., the initial structure is quite far from the cryo-EM map). Despite these large radii of convergence, both the tRNA and the full ribosome complex undergo large conformational changes, enabling a close fit with the high resolution cryo-EM map74.
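
The cross-correlation coefficient at the heart of this approach can be sketched in a few lines. Here both maps are represented as flat lists of voxel densities on the same grid; generating the simulated map from the atomistic model is not shown. The `bias_energy` function is a hypothetical illustration of how a correlation could enter a potential (zero for a perfect match, growing as the fit worsens) and is not taken from phenix.cryo_fit or any other package.

```python
import math


def cross_correlation(experimental, simulated):
    """Correlation coefficient between two maps given as flat voxel lists."""
    if len(experimental) != len(simulated) or not experimental:
        raise ValueError("maps must be non-empty and of equal size")
    n = len(experimental)
    mean_e = sum(experimental) / n
    mean_s = sum(simulated) / n
    cov = sum((e - mean_e) * (s - mean_s)
              for e, s in zip(experimental, simulated))
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    if var_e == 0 or var_s == 0:
        raise ValueError("a constant map has no defined correlation")
    return cov / math.sqrt(var_e * var_s)


def bias_energy(cc, weight=1.0):
    """Hypothetical biasing term that penalizes low map-model correlation."""
    return weight * (1.0 - cc)


# Toy 4-voxel maps, invented for illustration.
cc_good = cross_correlation([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0])
cc_bad = cross_correlation([0.0, 1.0, 2.0, 3.0], [3.0, 2.0, 1.0, 0.0])
print(cc_good, bias_energy(cc_good))
print(cc_bad, bias_energy(cc_bad))
```

Because the coefficient is insensitive to overall scaling of the densities, a simulated map that differs from the experimental map only by a constant factor still scores as a perfect match.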

Figure 3.

Flexible fitting of tRNA atomistic model into cryo-EM map a) before phenix.cryo_fit b) after phenix.cryo_fit (see supporting movie S1 for movie illustration). Flexible fitting produces a close fit despite a large radius of convergence.

Figure 4.

Flexible fitting of ribosome complex atomistic model into cryo-EM map a) before phenix.cryo_fit b) after phenix.cryo_fit. Flexible fitting of the ribosome complex using phenix.cryo_fit produces a close fit despite a large radius of convergence. A large conformational change of the L1 stalk of the ribosome is required to achieve the fit.

Refinement.

Refinement aims to obtain a model that fits the map as well as possible while possessing the expected geometry (no outliers). To achieve this, phenix.real_space_refine64 performs 5 macro-cycles of global real-space refinement with rotamer, Ramachandran plot and C-beta deviation restraints. On top of rotamer searching and structure minimization, Rosetta based refinement employs “fragment insertion” (protein backbone torsion angle replacement from probable/predicted protein fragments) for effective sampling of backbone dihedral angles75. For refinement that needs a larger radius of convergence, MD simulation based methods, including those described above4,67, can be used as complementary techniques76,77.

Conclusion

Here, we briefly reviewed practical considerations for atomistic structure building and fitting with respect to single particle analysis based cryo-EM reconstruction maps. We note that although we focus on modeling with single particle analysis based maps, similar approaches are being developed for tomography based cryo-EM maps32 and SAXS based solution volumes49. Additionally, while we focus on atomistic structure modeling in this article, other primary tools for cryo-EM (e.g., from map-quality assessment to validation) are well summarized by Liebschner et al.68 With more and more synergistic methods in development, fully automated atomistic structure modeling of cryo-EM maps may be possible in the near future.

Acknowledgement

We appreciate PHENIX software team members’ support, advice and useful discussions. DG was supported by the National Science Centre (Poland) grant number 2018/29/B/ST6/01989. KS was supported by NIH NIGMS grant R01-GM072686, LANL LDRD, LANL Institutional Computing and NSF.

Footnotes

Supporting movie

One example application of phenix.cryo_fit is attached as a movie (supporting movie S1). The initial conformation of the tRNA structure is fitted into the cryo-EM map of the final conformation.

References

  • (1).Kim DN; Sanbonmatsu K Tools for the Cryo-EM Gold Rush: Going from the Cryo-EM Map to the Atomistic Model. Biosci. Rep 2017, 37 (6), BSR20170072 10.1042/BSR20170072. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (2).Lopez-Blanco JR; Chacon P Structural Modeling from Electron Microscopy Data. Wiley Interdiscip. Rev. Comput. Mol. Sci 2015, 5 (1), 62–81. 10.1002/wcms.1199. [DOI] [Google Scholar]
  • (3).Malhotra S; Träger S; Dal Peraro M; Topf M Modelling Structures in Cryo-EM Maps. Curr. Opin. Struct. Biol 2019, 58, 105–114. [DOI] [PubMed] [Google Scholar]
  • (4).Kim DN; Moriarty NW; Kirmizialtin S; Afonine PV; Poon B; Sobolev OV; Adams PD; Sanbonmatsu K Cryo_fit: Democratization of Flexible Fitting for Cryo-EM. J. Struct. Biol 2019, 208 (1), 1–6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (5).Afonine PV; Klaholz BP; Moriarty NW; Poon BK; Sobolev OV; Terwilliger TC; Adams PD; Urzhumtsev A New Tools for the Analysis and Validation of Cryo-EM Maps and Atomic Models. Acta Crystallogr. Sect. D Struct. Biol 2018, D74, 814–840. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (6).Yu I; Nguyen L; Avaylon J; Wang K; Lai M; Zhou ZH Building Atomic Models Based on near Atomic Resolution CryoEM Maps with Existing Tools. J. Struct. Biol 2018, 204 (2), 313–318. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (7).Croll TI ISOLDE : A Physically Realistic Environment for Model Building into Low-Resolution Electron-Density Maps. Acta Crystallogr. Sect. D Struct. Biol 2018, 74 (6), 519–530. 10.1107/S2059798318002425. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (8).Pintilie GD; Zhang J; Goddard TD; Chiu W; Gossard DC Quantitative Analysis of Cryo-EM Density Map Segmentation by Watershed and Scale-Space Filtering, and Fitting of Structures by Alignment to Regions. J. Struct. Biol 2010, 170 (3), 427–438. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (9).Terwilliger TC, Adams PD, Afonine PV, S. O. Map Segmentation, Automated Model-Building and Their Application to the Cryo-EM Model Challenge Thomas. J Struct Biol 2018, 18, 30193. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (10).Jagger BR; Lee CT; McCammon JA; Amaro RE Computational Predictions of Drug-Protein Binding Kinetics with a Hybrid Molecular Dynamics, Brownian Dynamics, and Milestoning Approach. Biophys. J 2019, 116 (3), 562a. [Google Scholar]
  • (11).Ricci CG; Chen JS; Miao Y; Jinek M; Doudna JA; McCammon JA; Palermo G Deciphering Off-Target Effects in CRISPR-Cas9 through Accelerated Molecular Dynamics. ACS Cent. Sci 2019, 5 (4), 651–662. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (12).NCBI Blast (Accessed 03/14/2020) https://blast.ncbi.nlm.nih.gov/Blast.cgi.
  • (13).Hu G; Kurgan L Sequence Similarity Searching. Curr. Protoc. Protein Sci 2019, 95 (1), e71. [DOI] [PubMed] [Google Scholar]
  • (14).Söding J Protein Homology Detection by HMM–HMM Comparison. Bioinformatics 2004, 21 (7), 951–960. [DOI] [PubMed] [Google Scholar]
  • (15).Kodama Y; Shumway M; Leinonen R; Collaboration, on behalf of the I. N. S. D. The Sequence Read Archive: Explosive Growth of Sequencing Data. Nucleic Acids Res 2011, 40 (D1), D54–D56. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (16).Suzek BE; Wang Y; Huang H; McGarvey PB; Wu CH; Consortium, the U. UniRef Clusters: A Comprehensive and Scalable Alternative for Improving Sequence Similarity Searches. Bioinformatics 2014, 31 (6), 926–932. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (17).Gront D; Grabowski M; Zimmerman MD; Raynor J; Tkaczuk KL; Minor W Assessing the Accuracy of Template-Based Structure Prediction Metaservers by Comparison with Structural Genomics Structures. J. Struct. Funct. Genomics 2012, 13 (4), 213–225. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (18).Zheng W; Wuyun Q; Li Y; Mortuza SM; Zhang C; Pearce R; Ruan J; Zhang Y Detecting Distant-Homology Protein Structures by Aligning Deep Neural-Network Based Contact Maps. PLOS Comput. Biol 2019, 15 (10), e1007411. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (19).Combs SA; Deluca SL; Deluca SH; Lemmon GH; Nannemann DP; Nguyen ED; Willis JR; Sheehan JH; Meiler J Small-Molecule Ligand Docking into Comparative Models with Rosetta. Nat. Protoc 2013, 8 (7), 1277–1298. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (20).Pearson WR An Introduction to Sequence Similarity (“homology”) Searching. Curr. Protoc. Bioinforma 2013, Chapter 3, Unit3.1–Unit3.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (21).New Robetta webserver (Accessed 03/14/2020) http://new.robetta.org/.
  • (22).Schaeffer RD; Kinch L; Medvedev KE; Pei J; Cheng H; Grishin N ECOD: Identification of Distant Homology among Multidomain and Transmembrane Domain Proteins. BMC Mol. Cell Biol 2019, 20 (1), 18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (23).Rychlewski L; Li W; Jaroszewski L; Godzik A Comparison of Sequence Profiles. Strategies for Structural Predictions Using Sequence Information. Protein Sci 2000, 9 (2), 232–241. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (24).Steinegger M; Meier M; Mirdita M; Vöhringer H; Haunsberger SJ; Söding J HH-Suite3 for Fast Remote Homology Detection and Deep Protein Annotation. BMC Bioinformatics 2019, 20 (1), 473. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (25).Gniewek P; Kolinski A; Kloczkowski A; Gront D BioShell-Threading: Versatile Monte Carlo Package for Protein 3D Threading. BMC Bioinformatics 2014, 15 (1), 22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (26).Zimmermann L; Stephens A; Nam S-Z; Rau D; Kübler J; Lozajic M; Gabler F; Söding J; Lupas AN; Alva V A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at Its Core. J. Mol. Biol 2018, 430 (15), 2237–2243. [DOI] [PubMed] [Google Scholar]
  • (27).Song Y; DiMaio F; Wang RY-R; Kim D; Miles C; Brunette T; Thompson J; Baker D High-Resolution Comparative Modeling with RosettaCM. Structure 2013, 21 (10), 1735–1742. 10.1016/J.STR.2013.08.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (28).New Robetta webserver FAQ (Accessed 03/14/2020) http://new.robetta.org/faqs.php.
  • (29).Haas J; Barbato A; Behringer D; Studer G; Roth S; Bertoni M; Mostaguir K; Gumienny R; Schwede T Continuous Automated Model EvaluatiOn (CAMEO) Complementing the Critical Assessment of Structure Prediction in CASP12. Proteins Struct. Funct. Bioinforma 2018, 86 (S1), 387–398. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (30).Li SC; Ng YK Calibur: A Tool for Clustering Large Numbers of Protein Decoys. BMC Bioinformatics 2010, 11 (1), 25. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (31).Tyka MD; Keedy DA; André I; Dimaio F; Song Y; Richardson DC; Richardson JS; Baker D Alternate States of Proteins Revealed by Detailed Energy Landscape Mapping. J. Mol. Biol 2011, 405 (2), 607–618. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (32).AutoHomology (Accessed 02/02/2020, contact authors afterwards) https://github.com/kimdn/AutoHomology.
  • (33).Cock PJA; Antao T; Chang JT; Chapman BA; Cox CJ; Dalke A; Friedberg I; Hamelryck T; Kauff F; Wilczynski B; de Hoon M Biopython: Freely Available Python Tools for Computational Molecular Biology and Bioinformatics. Bioinformatics 2009, 25 (11), 1422–1423. 10.1093/bioinformatics/btp163. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (34).Senior AW; Evans R; Jumper J; Kirkpatrick J; Sifre L; Green T; Qin C; Žídek A; Nelson AWR; Bridgland A; Penedones H; Petersen S; Simonyan K; Crossan S; Kohli P; Jones D; Silver D; Kavukcuoglu K; Hassabis D Protein Structure Prediction Using Multiple Deep Neural Networks in the 13th Critical Assessment of Protein Structure Prediction (CASP13). Proteins Struct. Funct. Bioinforma 2019, 87 (12), 1141–1148. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (35).Kellogg EH; Leaver-Fay A; Baker D Role of Conformational Sampling in Computing Mutation-Induced Changes in Protein Structure and Stability. Proteins Struct. Funct. Bioinforma 2011, 79 (3), 830–838. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (36).Analyzing Rosetta Results (Accessed 03/14/2020) https://www.rosettacommons.org/docs/latest/getting_started/Analyzing-Results.
  • (37).Simkovic F; Ovchinnikov S; Baker D; Rigden DJ Applications of Contact Predictions to Structural Biology. IUCrJ 2017, 4 (3), 291–300. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (38).Marcos E; Silva D-A Essentials of de Novo Protein Design: Methods and Applications. WIREs Comput. Mol. Sci 2018, 8 (6), e1374. [Google Scholar]
  • (39).Frenz B; Walls AC; Egelman EH; Veesler D; DiMaio F RosettaES: A Sampling Strategy Enabling Automated Interpretation of Difficult Cryo-EM Maps. Nat. Methods 2017, 14 (8), 797–800. 10.1038/nmeth.4340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (40).Terwilliger TC; Adams PD; Afonine PV; Sobolev OV A Fully Automatic Method Yielding Initial Models from High-Resolution Cryo-Electron Microscopy Maps. Nat. Methods 2018, 15 (11), 905–908. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (41).Chen M; Baldwin PR; Ludtke SJ; Baker ML De Novo Modeling in Cryo-EM Density Maps with Pathwalking. J. Struct. Biol 2016, 196 (3), 289–298. 10.1016/j.jsb.2016.06.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (42).Murphy GS; Sathyamoorthy B; Der BS; Machius MC; Pulavarti SV; Szyperski T; Kuhlman B Computational de Novo Design of a Four-Helix Bundle Protein - DND_4HB. Protein Sci 2015, 24, 434–445. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (43).Abbass J; Nebel J-C Reduced Fragment Diversity for Alpha and Alpha-Beta Protein Structure Prediction Using Rosetta. Protein Pept. Lett 2017, 24 (3), 215–222. [DOI] [PubMed] [Google Scholar]
  • (44) Koga N; Tatsumi-Koga R; Liu G; Xiao R; Acton TB; Montelione GT; Baker D. Principles for Designing Ideal Protein Structures. Nature 2012, 491, 222–227.
  • (45) Gront D; Kulp DW; Vernon RM; Strauss CEM; Baker D. Generalized Fragment Picking in Rosetta: Design, Protocols and Applications. PLoS One 2011, 6 (8), e23294.
  • (46) Robetta webserver (Accessed 01/21/2020) http://robetta.bakerlab.org/.
  • (47) Leelananda SP; Lindert S. Using NMR Chemical Shifts and Cryo-EM Density Restraints in Iterative Rosetta-MD Protein Structure Refinement. J. Chem. Inf. Model. 2019. DOI: 10.1021/acs.jcim.9b00932.
  • (48) Wriggers W. Conventions and Workflows for Using Situs. Acta Crystallogr. Sect. D 2012, 68 (4), 344–351.
  • (49) Kim DN; Thiel BC; Mrozowich T; Hennelly SP; Hofacker IL; Patel TR; Sanbonmatsu KY. Zinc-Finger Protein CNBP Alters the 3-D Structure of LncRNA Braveheart in Solution. Nat. Commun. 2020, 11, 148. DOI: 10.1038/s41467-019-13942-4.
  • (50) Ovchinnikov S; Kamisetty H; Baker D. Robust and Accurate Prediction of Residue-Residue Interactions across Protein Interfaces Using Evolutionary Information. eLife 2014, 3, e02030. DOI: 10.7554/eLife.02030.
  • (51) Gremlin webserver (Accessed 03/14/2020) http://gremlin.bakerlab.org/index.php.
  • (52) Ovchinnikov S; Kim DE; Wang RY-R; Liu Y; DiMaio F; Baker D. Improved de Novo Structure Prediction in CASP11 by Incorporating Coevolution Information into Rosetta. Proteins Struct. Funct. Bioinforma. 2016, 84 (S1), 67–75.
  • (53) Kim DN; Jacobs TM; Kuhlman B. Boosting Protein Stability with the Computational Design of β-Sheet Surfaces. Protein Sci. 2016, 25 (3), 702–710.
  • (54) Kuhlman B; Baker D. Native Protein Sequences Are Close to Optimal for Their Structures. Proc. Natl. Acad. Sci. 2000, 97 (19), 10383–10388.
  • (55) Qian B; Raman S; Das R; Bradley P; McCoy AJ; Read RJ; Baker D. High-Resolution Structure Prediction and the Crystallographic Phase Problem. Nature 2007, 450 (7167), 259–264. DOI: 10.1038/nature06249.
  • (56) Sheffler W; Baker D. RosettaHoles2: A Volumetric Packing Measure for Protein Structure Refinement and Validation. Protein Sci. 2010, 19 (10), 1991–1995.
  • (57) Conway P; Tyka MD; DiMaio F; Konerding DE; Baker D. Relaxation of Backbone Bond Geometry Improves Protein Energy Landscape Modeling. Protein Sci. 2014, 23 (1), 47–55.
  • (58) Stranges PB; Kuhlman B. A Comparison of Successful and Failed Protein Interface Designs Highlights the Challenges of Designing Buried Hydrogen Bonds. Protein Sci. 2013, 22, 74–82.
  • (59) Fleishman SJ; Leaver-Fay A; Corn JE; Strauch EM; Khare SD; Koga N; Ashworth J; Murphy P; Richter F; Lemmon G; Meiler J; Baker D. RosettaScripts: A Scripting Language Interface to the Rosetta Macromolecular Modeling Suite. PLoS One 2011, 6 (6), 1–10.
  • (60) Kleffner R; Flatten J; Leaver-Fay A; Baker D; Siegel JB; Khatib F; Cooper S. Foldit Standalone: A Video Game-Derived Protein Structure Manipulation Interface Using Rosetta. Bioinformatics 2017, 33 (17), 2765–2767.
  • (61) The PyMOL Molecular Graphics System, Version 1.8; Schrödinger, LLC.
  • (62) Wang J; Xiao Y. Using 3dRNA for RNA 3-D Structure Prediction and Evaluation. Curr. Protoc. Bioinforma. 2017, 57 (1), 5.9.1–5.9.12.
  • (63) phenix.cryo_fit2 webpage (Accessed 03/14/2020) https://phenix-online.org/documentation/reference/cryo_fit2.html.
  • (64) Afonine PV; Poon BK; Read RJ; Sobolev OV; Terwilliger TC; Urzhumtsev A; Adams PD. Real-Space Refinement in Phenix for Cryo-EM and Crystallography. Acta Crystallogr. Sect. D Struct. Biol. 2018, 74, 531–544. DOI: 10.1101/249607.
  • (65) Pettersen EF; Goddard TD; Huang CC; Couch GS; Greenblatt DM; Meng EC; Ferrin TE. UCSF Chimera - A Visualization System for Exploratory Research and Analysis. J. Comput. Chem. 2004, 25 (13), 1605–1612. DOI: 10.1002/jcc.20084.
  • (66) Terwilliger T. phenix.dock_in_map (Accessed 03/14/2020) https://www.phenix-online.org/documentation/reference/dock_in_map.html.
  • (67) Mori T; Kulik M; Miyashita O; Jung J; Tama F; Sugita Y. Acceleration of Cryo-EM Flexible Fitting for Large Biomolecular Systems by Efficient Space Partitioning. Structure 2018, 1–14.
  • (68) Liebschner D; Afonine PV; Baker ML; Bunkóczi G; Chen VB; Croll TI; Hintze B; Hung L-W; Jain S; McCoy AJ; Moriarty NW; Oeffner R; Poon B; Prisant M; Read R; Richardson J; Richardson D; Sammito M; Sobolev O; Stockwell D; Terwilliger T; Urzhumtsev A; Videau L; Williams C; Adams PD. Macromolecular Structure Determination Using X-Rays, Neutrons and Electrons: Recent Developments in Phenix. Acta Crystallogr. Sect. D 2019, 75 (10), 861–877.
  • (69) Moriarty NW; Grosse-Kunstleve RW; Adams PD. Electronic Ligand Builder and Optimization Workbench (ELBOW): A Tool for Ligand Coordinate and Restraint Generation. Acta Crystallogr. Sect. D Biol. Crystallogr. 2009, 65 (10), 1074–1080.
  • (70) Deigan KE; Li TW; Mathews DH; Weeks KM. Accurate SHAPE-Directed RNA Structure Determination. Proc. Natl. Acad. Sci. 2009, 106, 97–102.
  • (71) Dyla M; Terry DS; Kjaergaard M; Sørensen TL-M; Lauwring Andersen J; Andersen JP; Rohde Knudsen C; Altman RB; Nissen P; Blanchard SC. Dynamics of P-Type ATPase Transport Revealed by Single-Molecule FRET. Nature 2017, 551 (7680), 346–351.
  • (72) Sobolev OV; Afonine PV; Adams PD; Urzhumtsev A. Programming New Geometry Restraints: Parallelity of Atomic Groups. J. Appl. Crystallogr. 2015, 48 (4), 1130–1141.
  • (73) phenix.model_idealization (Accessed 03/14/2020) https://www.phenix-online.org/documentation/reference/model_idealization.html.
  • (74) Dunkle JA; Wang L; Feldman MB; Pulk A; Chen VB; Kapral GJ; Noeske J; Richardson JS; Blanchard SC; Cate JHD. Structures of the Bacterial Ribosome in Classical and Hybrid States of tRNA Binding. Science 2011, 332 (6032), 981–984. DOI: 10.1126/science.1202692.
  • (75) Wang RYR; Song Y; Barad BA; Cheng Y; Fraser JS; DiMaio F. Automated Structure Refinement of Macromolecular Assemblies from Cryo-EM Maps Using Rosetta. eLife 2016, 5, e17219. DOI: 10.7554/eLife.17219.
  • (76) Leelananda SP; Lindert S. Iterative Molecular Dynamics-Rosetta Membrane Protein Structure Refinement Guided by Cryo-EM Densities. J. Chem. Theory Comput. 2017, 13 (10), 5131–5145.
  • (77) Lindert S; Meiler J; McCammon JA. Iterative Molecular Dynamics-Rosetta Protein Structure Refinement Protocol to Improve Model Quality. J. Chem. Theory Comput. 2013, 9 (8), 3843–3847.

Supplementary Materials

Supplementary material: video file (mp4, 8.8 MB).