DeepMainmast: Integrated Protocol of Protein Structure Modeling for Cryo-EM with Deep Learning and Structure Prediction

Genki Terashi; Xiao Wang; Devashish Prasad; Tsukasa Nakamura; Daisuke Kihara

doi:10.1038/s41592-023-02099-0

Author manuscript; available in PMC: 2026 Jan 20.

Published in final edited form as: Nat Methods. 2023 Dec 8;21(1):122–131. doi: 10.1038/s41592-023-02099-0



PMCID: PMC12815591 NIHMSID: NIHMS2126338 PMID: 38066344

Abstract

Three-dimensional structure modeling from maps is an indispensable step for studying proteins and their complexes with cryogenic electron microscopy (cryo-EM). Although the resolution of determined cryo-EM maps has generally improved, there are still many cases where tracing protein main-chains is difficult, even in maps determined at near-atomic resolution. Here we developed a protein structure modeling method, DeepMainmast, which employs deep learning to capture the local map features of amino acids and atoms to assist main-chain tracing. Moreover, we have integrated AlphaFold2 with the de novo density tracing protocol to combine their complementary strengths and achieved even higher accuracy than either method alone. Additionally, the protocol is able to accurately assign the chain identity to the structure models of homo-multimers, which is not a trivial task for existing methods.

Keywords: electron microscopy, cryo-EM, protein structure modeling, deep learning, structural biology, protein structure prediction

Introduction

An increasing number of protein structures have been modeled from cryo-electron microscopy (cryo-EM) maps. Although map resolution has generally improved steadily over the past years, there are still many situations in practice where modelers face difficulties in tracing main-chains. Moreover, it is observed that modeling errors frequently occur even in maps determined at a resolution around 2.5 to 3 Å ^1,2.

To aid structure modeling from cryo-EM maps, a new generation of methods has been developed over the past few years³. These methods include those that trace the main-chain through local points with high density values in the map using graph-based approaches^4–6, methods that use a structure fragment library^7,8 or structure prediction⁹, and methods that use deep learning to identify local density patterns that correspond to amino acids^10,11. These methods are particularly useful for de novo structure modeling, where experimentally determined structures are not available for fitting in maps with a resolution of around 3 to 5 Å, which fall beyond the scope of conventional modeling tools^12,13.

When discussing protein structure modeling of any kind, it is now essential to keep in mind the recent advances in protein structure prediction. AlphaFold2 (AF2), a substantially accurate protein structure prediction method¹⁴, is now frequently used in structure modeling for cryo-EM. While there are numerous reported cases where models generated by AF2 were accurate enough to be fitted in the map¹⁵, there are also several instances where the global conformation of AF2 models does not agree with experimental data^16–18.

Considering all the factors surrounding structure modeling for cryo-EM, here we developed an integrated protein structure modeling protocol for cryo-EM that combines protein-main-chain tracing using deep learning and structure modeling using AF2. The primary component of the protocol is DeepMainmast(base), a de novo protein main-chain tracing method guided by identified positions of Cα atoms and types of amino acids by deep learning. Compared to its predecessor, MAINMAST⁵, DeepMainmast features several improvements, including the use of deep learning and a more effective main-chain tracing approach called the Vehicle Routing Problem¹⁹ (VRP) solver²⁰, which is better suited for multi-chain complexes. In addition to generating structure models with DeepMainmast(base), we also employ AF2 and blend its model with those from DeepMainmast(base) when they share the same local structure conformations. The protocol generates a total of 20 models, with different levels of blending of models, which are finally ranked based on the sum of the dot products of matched vectors (DOT) score²¹, a map-model fitness score, and the deep-learning-based amino-acid-wise model quality (DAQ) score¹. The full DeepMainmast protocol was shown to substantially outperform existing methods including AF2 on datasets of experimental EM maps of single-chain proteins as well as hetero and homo multi-chain protein complexes. Particularly, DeepMainmast is equipped with a specific protocol to accurately connect individual chains in homo-multimers, which is not an easy task and has not been addressed in other methods.

Results

Modeling Protocol Workflow

DeepMainmast takes a cryo-EM map and sequences of proteins that are included in the map as input, and outputs protein structure models. The overall protocol is shown in Fig. 1. The main component of the pipeline is DeepMainmast(base), the deep learning-assisted protein structure modeling method. In parallel to the model building with DeepMainmast(base), protein structures are also predicted from the input sequences by AF2 and hybridized with DeepMainmast(base) models if local conformations are compatible with each other. We also fit the AF2 models to the map using our VESPER algorithm, which finds the optimal superimposition by considering local tensors of density²¹. Models fitted to the map are finally ranked by the scoring function, which combines the DAQ¹ and DOT²¹ scores; these evaluate the amino acid and atom propensity of local maps and the local tensor agreement used in VESPER, respectively. In what follows, we briefly explain each step. For more details see Methods.

Figure 1. We used a homo-multimer complex, magnesium channel CorA (PDB ID: 3JCF, EMD-6551, resolution: 3.8 Å), as an example. The magnified images in boxes highlight the transmembrane region. The yellow arrows represent the core of the protocol, DeepMainmast(base). It consists of six logical steps: (1) Detecting amino-acid types and atom types using deep learning (Emap2sf). The image on the left shows the detected atom types (Cα atom: green, carbon: orange, and nitrogen: light blue). The image on the right shows the detected amino acid types in different colors. (2) Tracing Cα paths and assigning the target sequence using the Vehicle Routing Problem Solver and the Dynamic Programming algorithm. Different parameter combinations are used. (3) Assembling Cα fragments using the Constraint Programming (CP) Solver. Colors indicate chain IDs. (4) Combining Cα models built under different parameter combinations using the CP Solver. Colors indicate the direction of chains from blue to red for the N-terminal to the C-terminal residues. (5) Full-atom building and refinement using PULCHRA and Rosetta-CM. (6) Scoring generated full-atom models based on the DAQ(AA) score and the DOT score. For homo-oligomer targets, chain IDs are assigned based on the structural similarity of homomer proteins (black arrows). The gray arrows represent the procedure to use AF2 models. AF2 models were integrated into the modeling process in two ways: (i) imposing fragments in the AF2 models onto Cα fragments; and (ii) fitting AF2 models to the EM map using the map-model fitting program, VESPER.

DeepMainmast(base)

Predicting amino acid and atom types using deep learning

The first step of the main-chain modeling by DeepMainmast(base) is to detect amino acids and atoms in a given EM map using a deep learning-based method, Emap2sf. Emap2sf uses a network of the U-net architecture²² and computes probabilities of twenty amino-acid types and six atom types (N, Cα, C, O, Cβ, Others) for each grid point in the density map. Computed probability values of twenty amino-acid types and three atoms (N, Cα, C) that form the backbone of protein structure are used in the following Cα tracing step.

Tracing Cα paths and assigning the protein sequence

Grid points that have a high probability for Cα are clustered by the mean shift algorithm²³ to generate representative points, named Local Dense Points (LDPs). Then, LDPs are connected to produce Cα paths using a VRP¹⁹ Solver²⁰. VRP is a variation of the traveling salesman problem, except that it uses multiple “vehicles” rather than a single “salesman” to explore the node space and connect nodes. Each vehicle in the VRP Solver explores the optimal routes from a pseudo starting point to connect a set of LDPs with the minimum total route cost under specified constraints. We defined the cost between two LDPs based on the distance and the lowest probability of main-chain atoms along the path between the two LDPs.
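For illustration only, the LDP generation step can be sketched as a weighted Gaussian mean shift over grid points that pass a Cα probability cutoff. This is a minimal sketch, not the DeepMainmast implementation: the function names, the kernel, and the `cutoff`, `bandwidth`, and `merge_dist` values are all assumptions made for the example.

```python
import math

def mean_shift_modes(points, weights, bandwidth=1.0, n_iter=100, tol=1e-4, merge_dist=0.5):
    """Weighted Gaussian mean shift; returns one representative point per density mode.

    All numeric defaults are hypothetical and chosen only for this example.
    """
    def shift(p):
        # Move p to the kernel-weighted average of all points.
        num = [0.0, 0.0, 0.0]
        den = 0.0
        for q, w in zip(points, weights):
            k = w * math.exp(-math.dist(p, q) ** 2 / (2.0 * bandwidth ** 2))
            den += k
            for i in range(3):
                num[i] += k * q[i]
        return tuple(x / den for x in num)

    modes = []
    for start in points:
        cur = tuple(start)
        for _ in range(n_iter):
            nxt = shift(cur)
            moved = math.dist(cur, nxt)
            cur = nxt
            if moved < tol:
                break
        # Keep the converged point only if it is not a duplicate of a known mode.
        if all(math.dist(cur, m) >= merge_dist for m in modes):
            modes.append(cur)
    return modes

def local_dense_points(grid_points, cutoff=0.5, bandwidth=1.0):
    """grid_points: list of ((x, y, z), ca_probability) pairs."""
    kept = [(xyz, p) for xyz, p in grid_points if p >= cutoff]
    if not kept:
        return []
    points = [xyz for xyz, _ in kept]
    weights = [p for _, p in kept]
    return mean_shift_modes(points, weights, bandwidth)
```

In this sketch, grid points below the cutoff are discarded before clustering, so varying the cutoff changes how many LDPs are handed to the path-tracing step.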

Once Cα paths are computed, they are aligned with the target sequence using the Smith-Waterman Dynamic Programming (DP) algorithm. To define the matching score between a Cα position in the path and each amino-acid type, we used the DAQ(AA) score¹ computed from the Emap2sf output. Typically, 1,000–50,000 Cα path-sequence alignments, which we refer to as Cα fragments, are generated for a single-chain protein. We perform this process for each combination of three parameters: the Cα probability cutoff, the number of vehicles, and a parameter defining the cost function (see Methods).
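The path-to-sequence alignment can be illustrated with a generic Smith-Waterman local alignment in which each path position carries its own per-amino-acid score. This is a sketch under stated assumptions, not the authors' code: the linear gap penalty, the default score for amino acids missing from a position's table, and the function name are hypothetical.

```python
def align_path_to_sequence(path_scores, sequence, gap=-1.0, missing=-0.5):
    """Smith-Waterman local alignment of a C-alpha path against a sequence.

    path_scores[i] maps amino-acid letters to a DAQ(AA)-like score for path
    position i. `gap` and `missing` are hypothetical values for this example.
    Returns (best_score, [(path_index, sequence_index), ...]).
    """
    n, m = len(path_scores), len(sequence)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = path_scores[i - 1].get(sequence[j - 1], missing)
            H[i][j] = max(0.0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the local alignment score drops to zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = path_scores[i - 1].get(sequence[j - 1], missing)
        if H[i][j] == H[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs
```

Each returned list of index pairs plays the role of one Cα fragment in the sense used above: a stretch of path positions with sequence positions assigned to them.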

Integration of AF2 structure prediction models

In the DeepMainmast protocol, we integrate protein structures predicted by AF2 (the right branch in Fig. 1). The AF2 model is integrated into the DeepMainmast protocol in two ways: as a source of Cα fragments and as a global structure that is fit to the map (two gray arrows in Fig. 1 from AF2 models). We used AF2 version 2.1.0 and ran it without template protein structure data. Out of the five full-atom models that AF2 generates, we used the top-ranked model based on the predicted local quality score, pLDDT.

Adding fragments from the AF2 model

The AF2 model is superimposed on the Cα fragments, and structure regions that agree sufficiently with the Cα trace by DeepMainmast(base) are extracted and added to the Cα fragment library. A local structure from an AF2 model needs to have a root mean square deviation (RMSD) of less than 1.5 Å for nine-residue or longer regions to be included in the library. Local structures from the AF2 model are extracted only when they satisfy the agreement criteria; thus, incorrect AF2 model regions would not be selected. The main benefit of this process is to supplement the Cα fragment library with AF2 fragments that fill missing regions, such as loop regions or terminal regions in the protein, which correspond to low-density regions in the map and could not be traced. Up to this point in the protocol, four parameters were used. We constructed a Cα fragment library for each of the parameter combinations, which resulted in a total of 54 libraries for single-chain proteins and 108 libraries for multi-chain protein complexes. In the next step, a Cα protein model is generated for each of the libraries.
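One way to picture the agreement criterion is a sliding nine-residue window over residue-matched Cα coordinates. The sketch below assumes the AF2 model has already been superimposed and that both coordinate lists are indexed by the same residues; the windowing and merging logic and the function name are illustrative assumptions, while the 1.5 Å cutoff and nine-residue length come from the text.

```python
import math

def agreeing_regions(af2_ca, traced_ca, window=9, rmsd_cutoff=1.5):
    """Return (first, last) residue index ranges where AF2 agrees with the trace.

    af2_ca and traced_ca are equal-length lists of (x, y, z) C-alpha coordinates
    for the same residues, with the AF2 model already superimposed (assumption).
    """
    n = len(af2_ca)
    keep = [False] * n
    for start in range(n - window + 1):
        sq = [math.dist(af2_ca[k], traced_ca[k]) ** 2 for k in range(start, start + window)]
        if math.sqrt(sum(sq) / window) < rmsd_cutoff:
            # Every residue inside an agreeing window is marked as usable.
            for k in range(start, start + window):
                keep[k] = True

    # Merge consecutive marked residues into regions.
    regions, begin = [], None
    for k, flag in enumerate(keep + [False]):
        if flag and begin is None:
            begin = k
        elif not flag and begin is not None:
            regions.append((begin, k - 1))
            begin = None
    return regions
```

Because a region is built only from windows that pass the cutoff, every reported region is at least nine residues long, and stretches where the AF2 model deviates from the trace are left out.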

Assembling Cα fragments to build protein models

Generated Cα fragments are assembled to build Cα structure models. For each of the libraries, one model is generated. This process is considered as an optimization problem of selecting fragments, which we solve with the Constraint Programming (CP)²⁰ solver. The CP solver explores feasible combinations of Cα fragments that maximize the total DAQ score while keeping the consistencies of combined fragments. Three types of constraints are considered in the process: (1) no steric collision between Cα fragments; (2) no inconsistent positions for the same amino acid from different fragments; if the same amino acid exists in different fragments, the locations of the amino acids should not be more than 3.5 Å apart; (3) no inconsistent Cα−Cα distance between Cα atoms from different fragments. Extended Data 1 illustrates these constraints.
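The fragment selection problem can be illustrated with a small exhaustive search. This is not the CP solver used in the protocol: it only shows the shape of the three constraints on a toy input. The 3.5 Å same-residue limit is from the text; the clash distance, the per-residue spacing with its tolerance, and the data layout are hypothetical.

```python
import math
from itertools import combinations

def compatible(frag_a, frag_b, same_res_max=3.5, clash_min=3.0, spacing=3.8, tol=0.5):
    """Check the three constraint types for a pair of fragments.

    Each fragment is {"score": float, "ca": {residue_index: (x, y, z)}}.
    clash_min, spacing and tol are hypothetical values for this example.
    """
    for ra, pa in frag_a["ca"].items():
        for rb, pb in frag_b["ca"].items():
            d = math.dist(pa, pb)
            if ra == rb:
                if d > same_res_max:                   # (2) same residue placed inconsistently
                    return False
            elif d < clash_min:                        # (1) steric collision
                return False
            elif d > spacing * abs(ra - rb) + tol:     # (3) residues too far apart for their sequence separation
                return False
    return True

def best_assembly(fragments):
    """Exhaustively pick the mutually compatible subset with the highest total score."""
    best_score, best_set = 0.0, ()
    for r in range(1, len(fragments) + 1):
        for combo in combinations(range(len(fragments)), r):
            if all(compatible(fragments[i], fragments[j]) for i, j in combinations(combo, 2)):
                total = sum(fragments[i]["score"] for i in combo)
                if total > best_score:
                    best_score, best_set = total, combo
    return best_score, best_set
```

Exhaustive enumeration scales exponentially with the number of fragments, which is why a dedicated constraint solver is needed for libraries of realistic size.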

Fitting the AF2 model to the EM density map by VESPER

In addition to assembling Cα fragments for each of the libraries, we also simply superimpose the AF2 model to the density map with the structure fitting program, VESPER²¹, as candidates of structure models (the second gray arrow from the AF2 model box in Fig. 1). Density maps are simulated from the AF2 model at three different resolutions and ten superimpositions are computed for each of them. This gives a total of 30 superimposed AF2 models on the density map.

Combining assembled Cα fragment models

Up to this point, 54 or 108 assembled Cα fragment models for a single-chain and a multi-chain complex target, respectively, and 30 superimpositions of the AF2 model are generated. Since the fragment-based models may still have gaps and errors in sequence assignment in local regions, we further combine Cα fragment models. Each of the Cα models is fragmented into ten-residue-long Cα fragments, and then the new Cα fragments are assembled into new Cα models by the CP solver. The same assembling method using the CP Solver is used as described above. The 54 or 108 assembled Cα fragment models are classified into three groups depending on the origin of the fragments, and one final Cα model is constructed from each group (see Methods). 30 superimpositions of the AF2 model are separately considered, and one final model is constructed by assembling them. Thus, in total, four Cα models are generated.

Chain ID assignment for homo-multimer targets

When modeling a target with multiple chains, the chain ID is naturally labeled at the sequence assignment step. However, assigning chain IDs is not trivial when the target has multiple chains with an identical sequence, because equivalent local sequence regions from different chains can be swapped in the modeling, producing wrong chain conformations. To address this issue, we developed a specific step that optimizes consistent chain ID assignment for cases where the target is a homo-multimer. This step is unique to DeepMainmast, as no other existing method has a similar implementation.

Building full-atom models

The Cα models generated using the CP Solver are then subjected to full-atom structure building using PULCHRA²⁴. Missing regions in all full-atom models are subsequently filled and refined by Rosetta-CM⁷. For each Cα model, Rosetta-CM generates five full-atom models, resulting in a total of twenty models generated from the four Cα models.

Evaluating generated models

The twenty full-atom models are evaluated and ranked with the DAQ¹ and DOT²¹ scores. The DAQ score evaluates the fit of the amino acid assignment at each amino acid position, and here we used the average DAQ score of all amino acid residues to evaluate a model. The DOT score from VESPER evaluates the agreement of local gradient directions of densities of the map and the simulated map from the protein model. Both normalized DAQ and DOT range from 0 to 1. The final score of the full-atom model is the sum of the two scores.
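The ranking rule (sum of two scores that each lie between 0 and 1) can be sketched as below. How the protocol normalizes the scores is described in Methods and is not reproduced here; the min-max normalization across candidate models used in this example is an assumption, as are the field names.

```python
def min_max(values):
    """Scale values to [0, 1]; this normalization is an assumption of the sketch."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def rank_models(models):
    """models: list of {"name": str, "daq": float, "dot": float}.

    Returns (combined_score, name) pairs sorted from best to worst.
    """
    daq = min_max([m["daq"] for m in models])
    dot = min_max([m["dot"] for m in models])
    scored = [(d + t, m["name"]) for d, t, m in zip(daq, dot, models)]
    return sorted(scored, reverse=True)
```

With both terms on the same 0 to 1 scale, neither the DAQ nor the DOT score dominates the sum simply because of its raw magnitude.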

DeepMainmast outputs two versions of the models. In the first version, DAQ(AA) scores are given to each residue, and all residues of the target proteins are included in the output PDB file. In the second version, amino acids with low DAQ scores (less than −0.5) are labeled as UNK, and residues with low DAQ(Cα) scores (less than −0.5) are not included in the structural model. We used −0.5 as the DAQ score cutoff because a residue in a model under this cutoff is not very reliable, as shown in our earlier studies^1,2.
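The filtering rule for the second output version can be written as a short sketch. The residue record layout and function name are hypothetical, and the sketch assumes that the score used for UNK labeling is the per-residue DAQ(AA) score; only the −0.5 cutoff is taken from the text.

```python
def filter_residues(residues, cutoff=-0.5):
    """Apply the two DAQ-based rules to a list of residue records.

    Each record is {"name": str, "daq_aa": float, "daq_ca": float}
    (hypothetical layout). Residues with a low C-alpha score are dropped;
    residues with a low amino-acid score are kept but renamed to UNK.
    """
    out = []
    for r in residues:
        if r["daq_ca"] < cutoff:
            continue  # unreliable position: leave the residue out of the model
        name = "UNK" if r["daq_aa"] < cutoff else r["name"]
        out.append({**r, "name": name})
    return out
```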

Results of DeepMainmast(base) on single-chain modeling

First, we evaluated the performance of DeepMainmast(base) (yellow arrows in Fig. 1) using a dataset of single-chain structures from 29 experimental EM maps used also for evaluating MAINMAST⁵ (Supplementary Table 1_Dataset). In this dataset, the deposited PDB structures were used as the native (reference) structure to be compared against structure models. The density regions that correspond to proteins in the EM maps were manually segmented from the entire EM map using the “zone” tool in UCSF Chimera²⁵.

We discuss two versions of the models from DeepMainmast(base). The first version consists of models built by the procedure applied up to the Cα model building (Fig. 1) without applying the subsequent full-atom building step. The second version consists of the full-atom models. We show results of both versions but mainly discuss the first in Fig. 2 because the full-atom building step uses Rosetta-CM⁷, which fills gaps in a structure model and may hinder an objective comparison with existing methods. As the baseline, results were compared with five other de novo modeling methods: MAINMAST⁵, DeepTracer¹⁰, Buccaneer²⁶, phenix.map_to_model²⁷ in the Phenix package, and CryoFold²⁸. MAINMAST is the original version of DeepMainmast, which does not use deep learning. DeepTracer has an architecture similar to that of DeepMainmast(base), as it uses deep learning to detect main-chain positions and amino-acid types. Buccaneer uses an electron density likelihood function to identify Cα positions and then traces protein chains by connecting Cα positions. phenix.map_to_model traces the backbone structure from the density map, and the backbone models are then refined by considering secondary structures. CryoFold is a pipeline of three methods: main-chain tracing (MAINMAST), molecular dynamics-based flexible fitting (ReMDFF²⁹), and modeling with MELD³⁰.

Figure 2. a, The average atom detection accuracy for each map by Emap2sf. Each dot shows the value of an individual map. The bold numbers shown are the average values across all the maps. In this box plot, the center line, the bottom, and the ceiling of a box show the median, first quartile, and third quartile values, respectively. The boundaries of whiskers show 1.5 times the distance between the upper and lower quartiles. An atom was considered to be correctly predicted if more than half of its corresponding grid points have the correct atom assignment. The statistics are calculated over n=29 independent targets, with each point’s values derived from Supplementary Table 2_AtomAcc. The values of minima, maxima, center, bounds of box and whiskers of different categories in order: N(0.36, 0.98, 0.89, 0.81/0.94, 0.71/0.98), CA(0.14, 0.98, 0.78, 0.70/0.88, 0.63/0.98), C(0.07, 0.93, 0.72, 0.62/0.79, 0.50/0.93), O(0.38, 0.98, 0.92, 0.85/0.96, 0.80/0.98), CB(0.09, 0.91, 0.68, 0.55/0.79, 0.23/0.91), Others(0.72, 0.98, 0.93, 0.92/0.95, 0.88/0.98). b, The average DAQ(AA) scores of each map. A dashed horizontal line was drawn at DAQ = 0. The statistics are calculated over n=29 independent targets, with each point’s values derived from Supplementary Table 3_DAQ(AA). The values of minima, maxima, center, bounds of box and whiskers of different categories in order: A(0.01, 0.62, 0.26, 0.18/0.54, 0.01/0.62), V(0.02, 0.46, 0.24, 0.16/0.36, 0.02/0.46), F(0.02, 0.70, 0.38, 0.28/0.49, 0.02/0.70), P(−0.03, 0.68, 0.34, 0.27/0.44, 0.14/0.64), M(−0.04, 0.47, 0.16, 0.07/0.20, −0.04/0.37), I(−0.03, 0.57, 0.23, 0.18/0.34, −0.03/0.57), L(0.02, 0.67, 0.34, 0.22/0.46, 0.02/0.67), D(0.01, 0.41, 0.18, 0.12/0.22, 0.01/0.30), E(−0.01, 0.44, 0.18, 0.09/0.25, −0.01/0.44), K(0.02, 0.71, 0.24, 0.13/0.38, 0.02/0.71), R(0.02, 0.70, 0.37, 0.26/0.57, 0.02/0.70), S(0.01, 0.63, 0.14, 0.09/0.28, 0.01/0.55), T(−0.01, 0.40, 0.14, 0.08/0.22, −0.01/0.40), Y(0.01, 0.86, 0.39, 0.32/0.57, 0.01/0.86), H(−0.04, 0.53, 0.07, 0.04/0.26, −0.04/0.53), C(−0.03, 0.93, 0.03, 0.00/0.24, −0.03/0.57), N(−0.03, 0.57, 0.18, 0.06/0.33, −0.03/0.57), W(0.00, 0.93, 0.55, 0.26/0.65, 0.00/0.93), Q(−0.01, 0.54, 0.18, 0.09/0.34, −0.01/0.54), G(0.02, 0.81, 0.34, 0.23/0.54, 0.02/0.81). c, The Cα coverage of the protein models. It is defined as the fraction of Cα atoms in a model that are placed within 3 Å of the correct position. The results by DeepMainmast(base) (the y-axis) were compared with MAINMAST (blue), DeepTracer (orange), Buccaneer (green), and phenix.map_to_model (Phenix, gray) (the x-axis). d, The amino acid matching accuracy. It is defined as the fraction of Cα atoms in a model that are placed within 3 Å of the correct position and have the correct amino acid type assignment. For computing matching Cα and matching amino acid type, we used the *phenix.chain_comparison* tool in *Phenix*. e, TM-Score computed with the TM-align program. f, The length of aligned regions between the model and the native structure by TM-align. These regions were used to compute RMSD in panel g. g, Cα RMSD of protein models.

The first two panels in Fig. 2 show the accuracy of structure detection by deep learning. Fig. 2a shows the atom detection accuracy. Atom positions were detected well, with the average accuracy of the six atom types ranging from 0.65 to 0.93. Of particular importance for the structure modeling protocol is the Cα atom detection, which achieved an accuracy of 0.76 on average for a map. For two maps, EMD-2364 and EMD-3246A (nanobody, PDB: 5foj chain A in EMD-3246), the accuracy of detecting Cα atoms was low, measuring 0.14 and 0.32, respectively. EMD-2364 was determined at a relatively low resolution of 4.4 Å. For EMD-3246A, high-density regions are fragmented in the nanobody part of the map, where predicting Cα atoms was less successful. Fig. 2b shows DAQ scores of amino acids in the target proteins in their correct positions in the map, which are used to align protein sequences to detected Cα paths in the map. The DAQ score ranges from negative to positive values, and a positive value indicates that the amino acid fits well into the position in the map. The overall positive DAQ scores in Fig. 2b indicate that on average Emap2sf successfully detected the correct position of amino acids in the map. Bulky amino acids, e.g. Trp, Tyr, Arg, and Phe, were detected with high DAQ scores. It is also interesting to note that negatively charged residues, Glu and Asp, which are known to be susceptible to radiation damage^31–33, have relatively low DAQ scores. In contrast, hydrophobic amino acid residues, which are less affected by radiation damage³³, have higher DAQ scores. The atom detection accuracy and DAQ scores of individual maps are provided in Supplementary Table 2_AtomAcc and Table 3_DAQ(AA), respectively.

Figs. 2c-g present the accuracy of structure models. Results of individual maps are provided in Supplementary Table 4_Single_Model_Acc. These panels evaluate the first-version models that have only Cα atoms. Results of CryoFold are shown separately in Extended Data 2 because it failed to run on 11 of the 29 targets. The performance of CryoFold was similar to MAINMAST since CryoFold uses a MAINMAST model as the starting structure. Fig. 2c reports the fraction of Cα atoms in the native structure that were matched with any Cα atoms in the structure model within 3.0 Å. On average, DeepMainmast(base) identified 93.4% of Cα atom positions, which was higher than MAINMAST (82.5%), DeepTracer (89.5%), Buccaneer (78.4%), and Phenix (63.3%). Fig. 2d examines the accuracy of amino acid matching. It is the fraction of Cα atoms in the native structure that match with Cα atoms in the predicted model within 3.0 Å and also have the same amino acid type in the model. The average amino acid match of the DeepMainmast model was 80.7%, while those of the MAINMAST, DeepTracer, Buccaneer, and Phenix models were 28.8%, 64.3%, 33.1%, and 19.6%, respectively (Fig. 2d). In Extended Data 3, we further analyzed the fraction of incorrect amino acid matchings caused by incorrect amino acid type assignment to otherwise correct Cα atom positions.
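The two sequence order-independent metrics can be approximated with a nearest-neighbor sketch. The reported values were computed with the phenix.chain_comparison tool; the simplified function below is only an illustration, and it assumes that each native Cα is compared with its single nearest model Cα.

```python
import math

def coverage_and_match(native, model, cutoff=3.0):
    """Return (C-alpha coverage, amino acid match) as fractions of native residues.

    native and model are lists of (amino_acid, (x, y, z)) tuples for C-alpha atoms.
    Matching each native atom to its nearest model atom is a simplification
    made for this example.
    """
    covered = matched = 0
    for aa, xyz in native:
        near_aa, near_xyz = min(model, key=lambda m: math.dist(xyz, m[1]))
        if math.dist(xyz, near_xyz) <= cutoff:
            covered += 1
            if near_aa == aa:
                matched += 1
    n = len(native)
    return covered / n, matched / n
```

Because the second metric counts only covered positions that also carry the right amino acid type, it can never exceed the coverage, which is consistent with the gap between the two sets of percentages reported above.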

The Cα coverage (Fig. 2c) and the amino acid match accuracy (Fig. 2d) are sequence order-independent metrics. On the other hand, the subsequent metrics in Fig. 2, TM-Score³⁴ (Fig. 2e) and Cα RMSD (Fig. 2g), are sequence order-dependent metrics, which evaluate the topological accuracy of protein models. TM-Score ranges from 0 to 1, with 1 indicating that the structures are identical. DeepMainmast(base) achieved a higher TM-Score than the other four methods for all but 5 maps. The average TM-Scores of DeepMainmast(base), MAINMAST, DeepTracer, Buccaneer, and Phenix were 0.83, 0.47, 0.66, 0.44, and 0.41, respectively. Compared to the structure models generated by DeepTracer, Buccaneer, and Phenix, the models built by DeepMainmast(base) typically include more correctly placed amino acids, as indicated by the higher number of orange data points in Fig. 2f. In some cases, models generated by these three methods contain Cα atoms that were not assigned to the correct sequence or position, causing them to be excluded from structural alignment by TM-align and resulting in shorter aligned lengths in Fig. 2f. In contrast, MAINMAST always includes all residues in the protein target, but DeepMainmast(base) occasionally misses a few residues (blue data points in Fig. 2f) due to its use of fragment-based modeling with the VRP solver. These missing residues can be filled in by Rosetta-CM in the subsequent step. Models by DeepMainmast(base) had on average lower Cα RMSD (Fig. 2g) than the other three methods even though more residues were in general included in the structure alignments considered. Against MAINMAST, DeepMainmast(base) had better Cα RMSD results, with only one exception, EMD-2364. EMD-2364 (PDB: 4btg-A) is at a relatively low resolution of 4.4 Å and the protein is large, 761 residues long. DeepMainmast identified Cα positions comparably to DeepTracer but was not able to assign the correct sequence positions. The Cα coverage by DeepMainmast(base) was 0.68, but the amino acid matching accuracy was 0.04.
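For readers unfamiliar with TM-Score, the standard scoring formula can be sketched for a fixed set of aligned residue pairs. TM-align additionally searches for the superposition and alignment that maximize this quantity, which the sketch does not do. The 0.5 Å floor on d0 for short chains follows common implementations and is stated here as an assumption.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 of the TM-score.

    The 0.5 floor for short chains is an assumption mirroring common implementations.
    """
    if target_length <= 21:
        return 0.5
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8

def tm_score(distances, target_length):
    """TM-score for given aligned-pair distances (in angstroms) under one fixed superposition.

    Residues of the target that are not aligned contribute zero, so a model
    that covers only part of the target cannot reach a score of 1.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length
```

Normalizing by the target length rather than the aligned length is what makes the metric sensitive to missing or misassigned residues, in addition to positional error.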

Extended Data 4 examines the accuracy of full-atom models. The performance of DeepMainmast(base) relative to the other methods was essentially the same as what we observed in Fig. 2.

Single-chain modeling on another dataset of 178 maps

We further benchmarked DeepMainmast on a different dataset of 178 experimental maps at a resolution of 5 Å or better, obtained from a recent work⁹ that proposed the CR-I-TASSER method. This time we examine models built by the full DeepMainmast protocol (Fig. 1), which includes integration of AF2 models. Four methods were compared: (full) DeepMainmast, DeepMainmast(base), AF2, and CR-I-TASSER. For the two versions of DeepMainmast, we performed full-atom model building and refinement, then, the top-scoring model was selected using DAQ and DOT scores. CR-I-TASSER combines protein structure prediction methods with prediction of Cα atom positions in an EM map using deep learning. This architecture is similar to the DeepMainmast protocol, which integrates DeepMainmast(base) with AF2. We used TM-Score to evaluate protein models since it is the only metric used in the CR-I-TASSER paper. Results for individual maps are provided in Supplementary Table 5_178targets.

DeepMainmast showed substantially higher accuracy than DeepMainmast(base), as shown in Fig. 3a. Figs. 3b and 3c compare DeepMainmast(base) with CR-I-TASSER and AF2. Among 178 maps, DeepMainmast(base) had a higher TM-Score than CR-I-TASSER for 103 cases (57.9%) (Fig. 3b). When compared with AF2, DeepMainmast(base) had a higher TM-Score for slightly more than half, 93 cases (52.2%). Upon AF2 integration (Fig. 3d and 3e), DeepMainmast outperformed the other two. A higher score was observed for 156 cases (87.6%) over CR-I-TASSER and 156 cases (87.6%) over AF2. Fig. 3f compares CR-I-TASSER with AF2. AF2 produced higher TM-Score models than CR-I-TASSER for 59.6% of the cases even though AF2 does not consider EM map information.

Comparing marginal TM-Score distributions of DeepMainmast(base) and DeepMainmast (e.g. Fig. 3b and Fig. 3d), we noticed a set of maps for which DeepMainmast(base) did not perform well, with a TM-Score around 0.2. All of these were found to be low-resolution maps (worse than 4 Å). In Fig. 3g, we analyzed the TM-Score of models relative to the map resolution. The figure shows that the TM-Score of models built by DeepMainmast(base) declines as the map resolution worsens, particularly when the resolution becomes lower than 4 Å. At a resolution of 4 Å or lower, map density starts to lose atom- and residue-level information, making it more difficult to correctly detect atom and amino acid residue information even with deep learning¹. The other three methods in Fig. 3g did not show a decline in TM-Score as the resolution decreases because they use structure prediction methods that do not rely on map information. Therefore, in the full DeepMainmast protocol, AF2 compensates well for the density-based main-chain tracing performed by DeepMainmast(base).

In Extended Data 5, we analyzed the performance of DeepMainmast(base) and DeepMainmast relative to local map resolution and secondary structure. The accuracy went down as the local resolution became worse particularly for DeepMainmast(base). The advantage of using AF2 models (i.e. DeepMainmast) was more noticeable when local map resolution is low, indicating AF2 models complemented inaccurate density tracing by DeepMainmast(base). Regarding the protein secondary structure, residues in helices were modelled more accurately than β strands and loops.

Examples of single-chain protein structure modeling

Fig. 4 presents six examples of protein structure models built by DeepMainmast. The first three (Fig. 4a-c) are from the dataset of 29 single-chain targets and highlight the Cα tracing accuracy. Fig. 4a is a 504-residue-long viral protein determined at a 2.8 Å resolution (PDB 5FOJ chain B, EMD-3246). For this map, Cα positions were detected well by DeepMainmast(base) and DeepTracer, with high Cα coverages of over 0.90 by both methods. However, the TM-Scores of the two methods turned out to be largely different, with a score of 0.95 for the DeepMainmast(base) model and 0.47 for the DeepTracer model, due to the difference in the accuracy of assigning amino acid type and the sequence order of the protein to identified Cα positions. The application of Rosetta-CM made a small improvement to the DeepMainmast(base) model, from 0.93 to 0.95. Fig. 4b and 4c are examples from maps determined at a lower resolution, 3.5 Å and 3.8 Å, respectively. The TM-Score was lower than in the first example (Fig. 4a) for both methods, but DeepMainmast(base) maintained the score at 0.90 or higher. In the example of Fig. 4c, a relatively large improvement of TM-Score from 0.85 to 0.90 was observed in the DeepMainmast(base) model by applying Rosetta-CM. This improvement is due to the supplementation of missing residues by Rosetta-CM, including a loop of 11 residues and 12 other small gaps of one to four residues.

The latter three examples in Fig. 4d-f are from the dataset of 178 maps discussed in Fig. 3, and are shown to compare the full DeepMainmast with AF2 and also with CR-I-TASSER. Fig. 4d shows a 398-residue-long two-domain structure of a virus capsid protein, VP6. The TM-Score of the AF2 model was 0.59 (Fig. 4d right), as the orientation of the two domains was incorrect although individual domain structures were folded correctly. This is a known problem of AF2 when applied to a protein in a cryo-EM map. CR-I-TASSER also did not model the entire structure correctly, resulting in a TM-Score of 0.64 as reported in their work⁹. In contrast, DeepMainmast(base) was able to trace almost the entire main-chain correctly (TM-Score: 0.99), which improved to 1.0 by considering fragments from AF2 in the full DeepMainmast pipeline. The next protein (Fig. 4e) is a challenging target due to a long, extended β-sheet stem region with a lower resolution density. CR-I-TASSER and AF2 built the correct fold for the globular domain of this protein but did not build the stem structure. DeepMainmast(base) was able to trace the overall main-chain conformation including the stem, except for the hairpin loop at the tip of the stem, because the map does not have clear density there. The hairpin loop was not recovered by using AF2 fragments either, because even AF2 did not model the region correctly. The last example (Fig. 4f) is a 407-residue-long protein that has a helix bundle fold. Since the EM map has a relatively low resolution of 4.1 Å, DeepMainmast(base) failed to build some α-helices and made wrong chain connections (Fig. 4f, left). AF2, on the other hand, built an overall accurate topology, but the relative positions of α-helices are slightly shifted, particularly the C-terminal helix (colored in red). DeepMainmast generated the most accurate model because it considered the local structure fragments from the AF2 model, which DeepMainmast(base) could not build, and supplemented the density tracing result. This example illustrates how the AF2 model can enhance the modeling accuracy for challenging targets.

In Extended Data 6, we analyzed how the model accuracy improved at three major steps in the DeepMainmast protocol, Assembling Cα Fragments, Combining Models, and Building Full-Atom Models & Refinement.

Multi-chain structure modeling

We evaluated protein complex structure models built from cryo-EM maps, using 20 experimental maps of multi-chain complexes that were among the maps of the 29 single-chain dataset used in Fig. 2. When we used these maps for testing the single-chain modeling ability, we segmented them to extract a single-chain region, but for multi-chain modeling we used the entire maps. The number of chains in the maps ranged from 3 to 14. The largest target, EMD-5764, has 14 chains with a total of 4,082 residues. For reference, we ran DeepTracer on its web server with the same input data. Results of individual maps are provided in Supplementary Table 6_MultiChain_results.

We first compare DeepMainmast with and without applying the chain ID assignment step (Fig. 5a). In all but one case, the chain ID assignment either improved or kept the TM-Score of the multi-chain complex models. The chain ID assignment algorithm can often substantially improve TM-Score by correcting fragment combinations for forming individual chains. In seven cases, TM-Score was approximately doubled by the chain ID assignment.

Next, we compare the performance of seven methods: DeepMainmast(base) and DeepMainmast with and without Chain ID assignment (four combinations), DeepTracer, Buccaneer, and phenix.map_to_model (Phenix), using four metrics (Supplementary Table 7). The full DeepMainmast, including the chain ID assignment, showed the highest performance for all the metrics. DeepMainmast had the highest average TM-Score of 0.94, followed by DeepMainmast(base) and DeepMainmast without chain ID assignment, which both had a score of 0.74, then DeepMainmast(base) without chain ID assignment (0.65), DeepTracer and Buccaneer (0.55), and Phenix (0.20). The scores of all the metrics consistently improved when the chain ID assignment was applied in both DeepMainmast(base) and DeepMainmast. DeepTracer and Buccaneer assigned Cα atoms well with a Cα coverage of 0.91 and 0.85, respectively. However, they had lower values for the other metrics compared with the other methods, mainly due to low amino acid matching and sequence alignment accuracy. Fig. 5b and 5c compare TM-Score and the sequence assignment accuracy of DeepMainmast and DeepTracer. DeepMainmast (data points in blue) had higher values than DeepTracer for all the 20 targets. In Extended Data 7, we compared DeepMainmast with three other de novo modeling methods, DeepTracer, Buccaneer, and Phenix, and found that DeepMainmast outperformed the others for almost all targets.

Fig. 5d and 5e illustrate how the chain ID assignment improves complex models in the DeepMainmast protocol. Both examples, EMD-5155 (Fig. 5d) and EMD-6551 (Fig. 5e), are five-chain homo-oligomer complexes. In both cases, DeepMainmast achieved high Cα coverage (0.81 and 0.94, respectively) and sequence identity (0.81 and 0.90, respectively) in the models even without chain ID assignment (the middle panel), indicating that amino acid positions and types were well detected and modelled (Supplementary Table 6_MultiChain_results). However, the TM-Scores of the models were relatively low, 0.40 and 0.63 in Fig. 5d and 5e, respectively, because local segments were swapped between chains. This problem was fixed by the chain ID assignment (right panels), resulting in a substantial increase of TM-Score to 0.89 and 0.99, respectively.

In Fig. 5f and 5g, we compare modeling results of DeepMainmast with DeepTracer. Fig. 5f is a homo-trimer complex of Rotavirus VP6 protein (EMD-1461). For this target, DeepMainmast assigned the sequence well with a sequence identity of 0.90 without the chain assignment step, but the TM-Score was 0.63 due to fragment swaps (the middle panel). The chain ID assignment corrected the fragment combinations, improving the TM-Score to 0.99 (the right panel). The DeepTracer model had a TM-Score of 0.56 and a sequence identity of 0.55. Fig. 5g is a 14-chain complex of Bordetella phage BPP-1 (EMD-5764), which consists of seven chains of the major capsid protein (chains A-G) and seven cementing proteins (chains H-N). This is a challenging target with many homo-oligomer chains and an intricate structure. DeepTracer had a Cα coverage of 0.88, slightly higher than DeepMainmast (0.85), but its overall TM-Score was low (0.44) because its amino acid matching accuracy was below one half (0.44) and it made several incorrect chain connections. DeepMainmast had a similar TM-Score of 0.51 without the chain ID assignment step, but the step substantially corrected the chain assignment and raised the TM-Score to 0.85.

Discussion

We have developed DeepMainmast, a deep learning-based method for de novo protein structure modeling for cryo-EM maps. The target global map resolution range for DeepMainmast is from 2.5 to 5.0 Å, as the neural network was trained to detect atoms and amino acids in maps of this resolution range. The modeling is fully automated, and no human intervention is needed. Overall, DeepMainmast achieved better performance than AF2 and the current state-of-the-art de novo modeling methods. We have recently developed a protocol called CryoREAD, which models the structure of nucleic acids from cryo-EM maps³⁵. Currently, we are combining DeepMainmast and CryoREAD for modeling protein-nucleic acid complexes from cryo-EM maps.

Online Methods

Detection of local structural properties in an EM map

The first step of DeepMainmast(base) is to detect local structural properties in an input cryo-EM map using deep learning. The deep learning method we use here, named Emap2sf (Emap to structural features), has a U-shaped Network (UNet3+) architecture with skip connections (Extended Data 8). For each grid point in the density map, Emap2sf computes probabilities of twenty amino acid types and six atom types (N, Cα, C, O, Cβ, Others). Emap2sf consists of two UNet3+ network architectures, one focused on amino acid type detection and the other on atom type detection. It takes density values extracted in a box of 32³ Å³ and outputs probability values for 20 amino acid types and 6 atom types at each grid point in the box. Emap2sf was trained on maps with a higher resolution (2.5 to 5 Å), as this is the targeted resolution range for protein structure modeling by DeepMainmast(base).

Training and validation datasets of experimental maps for Emap2sf

We first downloaded cryo-EM maps from EMDB³⁷ (the main maps in EMDB entries) that were determined at a resolution of 2.5 to 5.0 Å and have corresponding PDB entries deposited by the authors. Maps were removed if the deposited structures include DNA/RNA structures or unknown residues. To ensure the EM maps and associated PDB structures have proper alignments, we calculated the correlation between the experimental map and a simulated map from the deposited structure at the resolution of the experimental map. Maps were removed if the cross-correlation was lower than 0.65. Alignment between a map and the deposited structure was also manually inspected to confirm the agreement between the structure and the map. To build a non-redundant dataset, a map was removed if at least one protein chain pair from two maps share more than 25% sequence identity with each other. After applying all these steps, 237 cryo-EM maps remained. Among them, 197 maps (Supplementary Table 8_Training_Validation) were used for training and validation of Emap2sf and 40 maps (Supplementary Table 9_Test_set) were reserved as a test set.

The maps underwent pre-processing steps before being fed into the neural network. First, the grid size was unified to 1.0 Å by trilinear interpolation if the original grid size was different. Second, density values in a map were normalized to [0.0, 1.0] using minimum-maximum normalization, in the same way as our previous work¹. Before the normalization, negative density values were set to 0, and 0 was used as the minimum value for a map. For the maximum value, we used the density value at the 98th percentile, and values larger than this cutoff were set to 1.
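The normalization step described above can be sketched in a few lines of Python. This is only an illustration on a flat list of density values: the function name `normalize_density` and the nearest-rank percentile convention are assumptions, since the text does not state how the 98th percentile is computed.

```python
import math


def normalize_density(values, percentile=98.0):
    """Clamp negative densities to 0, then scale to [0, 1] by a percentile cutoff.

    Assumption: the percentile is taken with the nearest-rank rule over the
    clamped values; the paper does not specify the convention.
    """
    clipped = [max(v, 0.0) for v in values]
    ordered = sorted(clipped)
    rank = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    cutoff = ordered[rank - 1]
    if cutoff <= 0.0:
        # Degenerate map with no positive density below the cutoff rank.
        return [0.0 for _ in clipped]
    # Values above the cutoff saturate at 1.0.
    return [min(v / cutoff, 1.0) for v in clipped]


if __name__ == "__main__":
    print(normalize_density([-2.0, 1.0, 4.0, 8.0], percentile=75.0))
```

A real map would be a three-dimensional array rather than a list, but the clamping and scaling logic is the same per voxel.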

From each map, we collected boxes of a size of 32³ Å³ by scanning across the map along the three axes with a stride of 8 Å. We assigned each grid point in the box with an amino acid type and an atom type label that were taken from the closest heavy atoms of the closest residue located within 2.0 Å. If a grid point is close to more than two heavy atoms, then the closest heavy atom was used to provide the labels. A box was discarded if less than 0.1% of the grid points within it had amino acid or atom assignment.

Training the deep neural network of Emap2sf

Emap2sf has two separate UNets. The first UNet focuses on amino acid type detection and outputs probability values of twenty amino acids for each grid point to indicate the existence of the corresponding amino acid. The second UNet is for atom type detection and outputs probability values for N, Cα, C, O, Cβ, and Others. Both networks use the sigmoid activation function to output probability values in the range [0.0, 1.0]. For each training batch, 64 density boxes were randomly selected from the 197 maps (164 for training and 33 for validation), which totaled around 210,000 and 45,000 boxes used in an epoch for training and validation, respectively. We used the dice loss³⁸. Deep supervision was used to apply the loss to the output of the UNet at three different levels.
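For readers unfamiliar with the dice loss, the following is a minimal sketch of one common "soft" form for a single output channel. It is not taken from the DeepMainmast code: the exact variant used (for example, squared terms in the denominator or per-class weighting) is not specified in the text, and the smoothing constant `eps` is an assumption.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft dice loss for one channel.

    pred:   predicted probabilities in [0, 1], one per grid point
    target: ground-truth labels (0 or 1), one per grid point
    eps:    small smoothing constant (assumed) to avoid division by zero
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


if __name__ == "__main__":
    print(dice_loss([0.9, 0.1, 0.8], [1, 0, 1]))
```

The loss approaches 0 when prediction and label overlap perfectly and approaches 1 when they are disjoint, which makes it less sensitive than cross-entropy to the large number of empty grid points in a box.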

For the training of the network, we experimented with different combinations of a learning rate of [2e-5, 2e-4, 0.002, 0.02, 0.2] with an L2 regularization weight of [1e-6, 1e-5, 1e-4, 0.001, 0.01, 0.1] using the Adam optimizer³⁹. Among these combinations, the learning rate 2e-3 with L2 regularization parameter of 1e-5 showed the best grid-wise Intersection-over-Union (IoU) of 27.6% for amino acid detection and 53.1% IoU for atom detection on the validation set, although the performances with different combinations were similar. We used the same hyper-parameters for training both atom type detection UNet and amino acid type detection UNet.

Training and validation of the network took about 5 days using around 255,000 density boxes. The computation was run on two parallel NVIDIA TITAN RTX 24 GB GPUs with an NVLink connection.

The network (Extended Data 8) includes three encoder blocks and two decoder blocks. Both encoder and decoder blocks are built upon Three-Dimensional Convolutional Layer (Conv3d), Batch Normalization Layer, and ReLU activation. The network has about 7.4 million parameters to train. In contrast, the training dataset has 210,000 boxes with 32³ values that need to be assigned 20 probability values for amino acids and 6 probability values for atom types. This amounts to over 137 billion values to predict. As the number of values to predict is substantially larger than the number of parameters, our model is not able to memorize the information to predict.
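The count of predicted values can be checked with a short calculation. Assuming a 1.0 Å grid so that each box holds 32³ grid points, the amino acid outputs alone account for about 137.6 billion values, which is consistent with the "over 137 billion" figure above, and the atom type outputs add roughly another 41.3 billion.

```python
BOXES_PER_EPOCH = 210_000      # training boxes per epoch, from the text
VOXELS_PER_BOX = 32 ** 3       # assumes a 1.0 Å grid in a 32 Å box
AA_CHANNELS = 20               # amino acid type probabilities per grid point
ATOM_CHANNELS = 6              # atom type probabilities per grid point
PARAMETERS = 7_400_000         # approximate parameter count, from the text

aa_values = BOXES_PER_EPOCH * VOXELS_PER_BOX * AA_CHANNELS
atom_values = BOXES_PER_EPOCH * VOXELS_PER_BOX * ATOM_CHANNELS
total_values = aa_values + atom_values

if __name__ == "__main__":
    print(f"amino acid outputs: {aa_values:,}")
    print(f"atom type outputs:  {atom_values:,}")
    print(f"total outputs:      {total_values:,}")
    print(f"outputs per parameter: {total_values / PARAMETERS:,.0f}")
```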

Supplementary Table 2_AtomAcc shows the accuracy of Emap2sf on the 29 single-chain dataset. To evaluate the detection of structure features by Emap2sf on an EM map, the ground truth label of amino acids and atoms was assigned to each grid point following the same rules to assign labels for training. On average, the grid-wise accuracy of amino acid detection is 38.6% and the accuracy of atom type detection is 81.9%.

Tracing Cα paths

Using main-chain atoms detected in the cryo-EM map by Emap2sf, Cα traces are generated. Instead of connecting local points with a high density, we use the computed Cα probability of each grid point. For maps of a higher resolution than 2.5 Å, which is higher than the resolution of maps in the training set, a Gaussian filter is applied using the “vop gaussian” command with a standard deviation of 0.9 in UCSF Chimera. As a preparation, grid points with a Cα probability greater than a threshold $Θ$ are selected. Then, representative points, called Local Dense Points (LDPs), are identified from the selected points by the mean shift algorithm, which iteratively shifts each point toward points with a higher probability and clusters them locally to reduce the number of points.

Then, LDPs in the map are connected to produce Cα path(s) using the VRP Solver. Instead of using a single agent (salesman), multiple agents (vehicles) are used to visit and connect nodes (x₁,..x_N) in a graph, i.e. Cα LDPs in the map starting from a pseudo-node x₀. The objective of VRP is to minimize the following cost function:

\mathrm{Cost} = \sum_{i = 0}^{N} \sum_{j = 0}^{N} \sum_{v = 1}^{M} C_{i, j} E_{i, j, v} + 200 \sum_{k = 1}^{N} \mathrm{drop}_{k}

(Eq. 1)

where $N$ is the number of nodes (LDPs), $M$ is the number of vehicles, $C_{i, j}$ represents the cost of the path from the i-th node $(x_{i})$ to the j-th node $(x_{j})$, and $E_{i, j, v}$ represents the edge: $E_{i, j, v} = 1$ if vehicle $v$ travels directly from $x_{i}$ to $x_{j}$; otherwise, $E_{i, j, v} = 0$. $\mathrm{drop}_{k}$ controls the penalty cost incurred when no vehicle visits the node $x_{k}$: $\mathrm{drop}_{k} = 1$ if $x_{k}$ was not visited by any vehicle; otherwise, $\mathrm{drop}_{k} = 0$. All solutions of the path need to satisfy the following conditions:

All nodes need to be visited not more than once, i.e.

\sum_{i = 0}^{N} \sum_{v = 1}^{M} E_{i, j, v} \leq 1, \quad \text{for } j = 1, \dots, N

(Eq. 2)

All paths (vehicles) must start from the pseudo-node x₀ and either connect to another node or stay at x₀:

\sum_{j = 1}^{N} \sum_{v = 1}^{M} E_{i, j, v} \leq M, \quad \text{for } i = 0

(Eq. 3)

When a vehicle visits a node, the vehicle needs to depart from the same node:

\sum_{i = 0}^{N} E_{i, p, v} - \sum_{j = 0}^{N} E_{p, j, v} = 0, \quad \text{for } v = 1, \dots, M, \; p = 0, \dots, N

(Eq. 4)

To define the cost $C_{i, j}$ between two nodes ($x_{i}$ and $x_{j}$), we introduced two ideas: (1) the closer the distance between two connected nodes is to the ideal Cα-Cα distance (3.8 Å)^4,10, the more likely the path is accurate, and (2) if Emap2sf does not predict a path between two nodes, then the two nodes likely do not have a connection between them. Following these ideas, we defined the cost $C_{i, j}$ as:

C_{i, j} = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \\ 100 + 10 d_{i, j} & \text{else if } d_{i, j} > 10.0 \text{ or } \mathrm{MinP}_{i, j} = 0, \\ 100 - 100 \times \exp \left( \frac{- (10 d_{i, j} - 38)^{2}}{2 σ^{2}} \right) \times \mathrm{MinP}_{i, j} & \text{otherwise} \end{cases}

(Eq. 5)

where $d_{i, j}$ represents the Euclidean distance between $x_{i}$ and $x_{j}$, $\mathrm{MinP}_{i, j}$ is the lowest probability of backbone atoms (N, Cα, and C) on the straight line from $x_{i}$ to $x_{j}$, and $σ$ is the standard deviation of the Gaussian function. The cost between $x_{i}$ and $x_{j}$ $(C_{i, j})$ will be close to zero if $d_{i, j}$ is close to 3.8 Å and $\mathrm{MinP}_{i, j}$ is 1.0.
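The edge cost of Eq. 5 translates directly into a small function. The sketch below is illustrative rather than the authors' implementation: the function name `edge_cost` is hypothetical, and because the value of $σ$ is not given in this section it is left as a required argument (the value 5.0 in the example is an arbitrary placeholder).

```python
import math


def edge_cost(i, j, d_ij, min_p, sigma):
    """Cost C_ij of Eq. 5 for travelling from node i to node j.

    i, j:   node indices, where 0 is the pseudo-node x0
    d_ij:   Euclidean distance between the two nodes in Å
    min_p:  lowest backbone-atom probability along the straight line
    sigma:  standard deviation of the Gaussian (value not stated here)
    """
    if i == 0 or j == 0:
        return 0.0  # edges touching the pseudo-node are free
    if d_ij > 10.0 or min_p == 0.0:
        return 100.0 + 10.0 * d_ij  # implausible connection
    gauss = math.exp(-((10.0 * d_ij - 38.0) ** 2) / (2.0 * sigma ** 2))
    return 100.0 - 100.0 * gauss * min_p


if __name__ == "__main__":
    for d in (3.0, 3.8, 4.6, 12.0):
        print(d, round(edge_cost(1, 2, d, 1.0, sigma=5.0), 2))
```

The factor of 10 inside the exponential expresses distances in units of 0.1 Å, so the Gaussian peaks at 38, i.e. 3.8 Å.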

We varied values in the two parameters, $Θ = [0.3, 0.4, 0.5]$ , and M = [1, 5, 10] for single-chain targets or M = [5, 10, 20, 40] for multi-chain targets, and generated Cα paths for each combination to construct a pool of paths.

Assigning sequence to Cα traces to generate Cα fragments

The obtained Cα paths are aligned with the amino acid sequence using the Smith-Waterman Dynamic Programming (DP) algorithm. To quantify the match between an amino acid $aa_{i}$, the i-th amino acid in a sequence, and the node $p_{j}$ in the path traced in the map, we used the DAQ score of amino acid types¹. The DAQ score was developed to assess the likelihood that an amino acid exists at a designated position in an EM map, and thereby the quality of protein models built from the EM map¹. For a pair of $aa_{i}$ and $p_{j}$, DAQ is computed as:

\mathrm{DAQ} (i, j) = \log \left( \frac{P_{aa_{i}} (j)}{\frac{\sum_{k} P_{aa_{i}} (k)}{N_{node}}} \right)

(Eq. 6)

where $P_{aa_{i}} (j)$ is the probability of amino acid type $aa_{i}$ at the node $p_{j}$ computed by Emap2sf. $N_{node}$ is the total number of nodes in all the paths generated by the VRP Solver. Using the DAQ score, a sequence-path alignment is computed with the following rule to fill a DP matrix $M$:

M (i, j) = \max \begin{cases} M (i, j - 1) - 100 \\ M (i - 1, j - 1) + \mathrm{DAQ} (i, j) \\ M (i - 1, j) - 100 \end{cases}

(Eq. 7)

In the equation, −100 is a large penalty for a gap in comparison with the matching DAQ score, $\mathrm{DAQ}(i, j)$, which usually ranges from −2.0 to 2.0. For each Cα path, four threading runs were performed, combining the two sequence directions with the two options of resetting negative DAQ scores to zero or not. The sequence threading on the Cα path was iterated $Ψ$ times. For each iteration, we set $\mathrm{DAQ}(i, j)$ to zero at the positions aligned in the previous iteration round. By iterating with this reset of $\mathrm{DAQ}(i, j)$, a total of $Ψ$ different sequence-path alignments were generated from the same Cα path. For single-chain and multi-chain targets, we used $Ψ \in \{5, 10\}$ and $Ψ \in \{5, 10, 20\}$, respectively.
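Eq. 6 and the recurrence of Eq. 7 can be sketched as follows. This is a simplified illustration, not the DeepMainmast code: the per-node probabilities are passed as dictionaries, the first row and column of the matrix are initialized to zero (an assumption, since the boundary conditions are not stated), all probabilities are assumed to be strictly positive, and traceback and the iterative resetting of aligned positions are omitted.

```python
import math

GAP_PENALTY = -100.0  # gap penalty from Eq. 7


def daq_score(node_probs, aa, j):
    """Eq. 6: log ratio of P_aa at node j to the mean P_aa over all nodes."""
    mean_p = sum(node[aa] for node in node_probs) / len(node_probs)
    return math.log(node_probs[j][aa] / mean_p)


def fill_dp_matrix(sequence, node_probs):
    """Fill the DP matrix M of Eq. 7 for a sequence against a Cα path."""
    n_seq, n_node = len(sequence), len(node_probs)
    m = [[0.0] * (n_node + 1) for _ in range(n_seq + 1)]
    for i in range(1, n_seq + 1):
        for j in range(1, n_node + 1):
            match = m[i - 1][j - 1] + daq_score(node_probs, sequence[i - 1], j - 1)
            m[i][j] = max(m[i][j - 1] + GAP_PENALTY,
                          match,
                          m[i - 1][j] + GAP_PENALTY)
    return m


if __name__ == "__main__":
    path = [{"A": 0.8, "G": 0.2}, {"A": 0.2, "G": 0.8}]
    matrix = fill_dp_matrix("AG", path)
    print(round(matrix[2][2], 3))
```

Because the gap penalty is two orders of magnitude larger than typical DAQ values, the recurrence strongly favors ungapped diagonal matches along the path.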

After the threading process, computed Cα fragments $f_{n} (n = 1, \dots, N_{fragment})$ were evaluated by DAQ score. In the evaluation, we reset negative $\mathrm{DAQ}(i, j)$ values to zero. If the distance between two connected nodes in a Cα fragment was larger than 7 Å, the Cα fragment was split into two parts. Small Cα fragments that have fewer than five nodes were removed. For a single-chain protein of about 100 to 900 amino acids, typically 1,000 to 50,000 Cα fragments are generated.

Adding fragments from the AF2 model to the Cα fragment library

We use AF2 to predict the structure of the target protein, extract fragments from the AF2 model, and use them in the subsequent modeling step only if their local structures agree with the Cα fragments traced from the EM map. Among the five structure models generated by AF2, we use only the model with the highest pLDDT score. The local structure agreement of the AF2 model is examined as follows^34,40: First, nine-residue-long fragments are extracted from the AF2 model and from the traced Cα fragments with a stride of one residue, and they are exhaustively compared. A pair of nine-residue-long fragments is superimposed, and the two are considered to structurally agree with each other if the RMSD is less than 1.5 Å. Once such a pair with structural agreement is found, the entire AF2 model is superimposed based on the alignment of the nine-residue-long fragments. Next, the alignment between the AF2 model and Cα fragments is updated by re-evaluating corresponding nodes between them, which should be located within a distance cutoff of 1.5 Å of each other. This process is iterated until the alignment converges or the number of iterations reaches ten. If the length (the number of corresponding nodes) of the final alignment is less than the minimum alignment length $Λ$, the alignment is excluded. We used $Λ \in \{30, 50\}$. Then, the identified AF2 fragments are stored in the Cα fragment library. It frequently occurs that a continuous segment in the AF2 model, with a loop structure flanked by secondary structures on both sides, aligns with two disconnected Cα fragments of the secondary structure regions. In such a case, if the gap (e.g. loop) in the alignment is less than 30 residues, the entire AF2 model region is stored in the fragment library in addition to the aligned regions in the model. Similarly, if an aligned region is close to the N- or C-terminus of the protein, the aligned region of the AF2 model is extended up to 10 residues toward the terminus and stored in the library. A small, isolated alignment region is removed from the final alignment if it is shorter than five residues.

There are four hyper-parameters used up to this point of the protocol. They are the threshold $Θ$ of the Cα probability (3 values), the number of vehicles $M$ (3 and 4 values for single- and multi-chain targets, respectively), the number of different sequence alignments considered for a Cα path $Ψ$ (2 and 3 values for single- and multi-chain targets, respectively), and the minimum length of the fragment alignment to consider when matching with the AF2 model $Λ$ (2 values). For each of the parameter combinations, we constructed a separate Cα fragment library, from which a Cα model is built in the subsequent step. The number of libraries for single-chain protein targets was $3 (Θ) \times 3 (M) \times 2 (Ψ) = 18$ without AF2 fragments and $3 (Θ) \times 3 (M) \times 2 (Ψ) \times 2 (Λ) = 36$ with AF2 fragments, giving 18 + 36 = 54 in total. For multi-chain targets, the number of libraries was $3 (Θ) \times 4 (M) \times 3 (Ψ) = 36$ without AF2 fragments and $3 (Θ) \times 4 (M) \times 3 (Ψ) \times 2 (Λ) = 72$ with AF2 fragments, giving 36 + 72 = 108 in total.
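The library counts follow from enumerating the parameter grid, as the sketch below shows. The parameter values are those given in the text; the function name `library_settings` and the tuple layout are only illustrative.

```python
from itertools import product

THETA = [0.3, 0.4, 0.5]                       # Cα probability thresholds
VEHICLES = {"single": [1, 5, 10], "multi": [5, 10, 20, 40]}
PSI = {"single": [5, 10], "multi": [5, 10, 20]}
LAMBDA = [30, 50]                             # minimum AF2 alignment lengths


def library_settings(kind):
    """Return (theta, M, psi, lambda) tuples; lambda is None without AF2."""
    base = [(t, m, p, None)
            for t, m, p in product(THETA, VEHICLES[kind], PSI[kind])]
    with_af2 = [(t, m, p, lam)
                for t, m, p, lam in product(THETA, VEHICLES[kind], PSI[kind], LAMBDA)]
    return base + with_af2


if __name__ == "__main__":
    print(len(library_settings("single")), len(library_settings("multi")))
```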

Assembling Cα fragments using Constraint Programming (CP) Solver

For each Cα library, a Cα model is constructed by combining fragments with a CP Solver, which finds solutions for complex combination problems with constraints. Fragments are selected under the following constraints: Let $s_{n} (n = 1, \dots, N_{fragment})$ be a DAQ score of each Cα fragment $f_{n}$ . The objective is to maximize

\sum_{n = 1}^{N_{fragment}} r_{n} s_{n}

(Eq. 8)

subject to

r_{n} \in \{0, 1\}, \quad n = 1, \dots, N_{fragment}

under the constraint that prohibits the combination of two Cα fragments $f_{i}$ and $f_{j}$, i.e.

r_{i} + r_{j} \leq 1

(Eq. 9)

if $f_{i}$ and $f_{j}$ are an inconsistent pair, which is defined as a pair (1) with a steric collision (< 2.5 Å for any Cα pair), (2) with the same sequence position placed at positions over 3.5 Å apart, or (3) with an inconsistent Cα-Cα separation, in which residues of $f_{i}$ and $f_{j}$ are placed at a distance larger than (the sequence separation) × 5 Å (Supplementary Fig. 1).

When $N_{fragment}$ is larger than 10,000, the CP Solver requires a large memory space and an extremely long computational time. To avoid a large computational cost for such a case, we split the Cα fragment library into randomly selected subsets with less than 10,000 fragments and applied each subset separately and iteratively to the CP Solver.
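The selection problem of Eq. 8 and Eq. 9 can be illustrated on a toy input. The sketch below substitutes an exhaustive search for the CP Solver, so it is only usable for a handful of fragments, and it makes one assumption not stated in the text: the steric collision test is applied only to Cα pairs with different sequence positions, since identical positions are covered by rule (2). The function names and the fragment representation (a list of `(sequence_position, (x, y, z))` tuples) are hypothetical.

```python
import math
from itertools import combinations


def inconsistent(frag_a, frag_b):
    """Return True if two Cα fragments violate any of the three rules."""
    for pos_a, xyz_a in frag_a:
        for pos_b, xyz_b in frag_b:
            dist = math.dist(xyz_a, xyz_b)
            separation = abs(pos_a - pos_b)
            if separation == 0:
                if dist > 3.5:          # rule (2): same residue, far apart
                    return True
            elif dist < 2.5:            # rule (1): steric collision
                return True
            elif dist > separation * 5.0:  # rule (3): too far for separation
                return True
    return False


def best_subset(fragments, scores):
    """Brute-force stand-in for the CP Solver: maximize the summed score."""
    n = len(fragments)
    bad = {(i, j) for i, j in combinations(range(n), 2)
           if inconsistent(fragments[i], fragments[j])}
    best, best_score = (), 0.0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            if any(pair in bad for pair in combinations(subset, 2)):
                continue  # Eq. 9: r_i + r_j <= 1 for inconsistent pairs
            total = sum(scores[k] for k in subset)
            if total > best_score:
                best, best_score = subset, total
    return best, best_score


if __name__ == "__main__":
    frag_a = [(1, (0.0, 0.0, 0.0)), (2, (3.8, 0.0, 0.0))]
    frag_b = [(3, (7.6, 0.0, 0.0)), (4, (11.4, 0.0, 0.0))]
    frag_c = [(2, (10.0, 0.0, 0.0))]
    print(best_subset([frag_a, frag_b, frag_c], [1.0, 1.0, 1.5]))
```

In this toy case the highest-scoring single fragment is rejected because it conflicts with both of the others, and the two mutually consistent fragments are selected instead.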

Fitting the AF2 models to the density map by VESPER

In addition to extracting fragments from the AF2 model, we also fit the entire AF2 model to the map using VESPER²¹, an EM map alignment program. We generated simulated density maps for the AF2 model at three different resolutions (5.0, 6.0, and 7.0 Å) using e2pdb2mrc.py⁴¹, and each resulting map was superimposed onto the target EM map using VESPER. The ten best superimpositions were generated for each simulated map, which results in 30 superimpositions of the AF2 model on the EM map.

Combining Assembled Cα fragment models

Assembled Cα fragment models undergo another round of fragment splitting and assembly to combine different models. This process aims to fix potential problems, if any, such as gaps in the structure, imperfect overlaps of neighboring fragments, or errors in sequence assignments. At this point we have 108 Cα fragment-based models and 30 AF2 models superimposed on the map.

These models are classified into the following four groups: Group 1, models built by assembling Cα fragments obtained from tracing the EM map by DeepMainmast(base); Group 2, models built from the extended Cα fragment libraries with those originating from DeepMainmast(base) and the AF2 model; Group 3, models that are included in both Group 1 and Group 2; and Group 4, AF2 models superimposed on the EM density map by VESPER. For each of the four groups, models were cut into overlapping ten-residue-long fragments, from which a single Cα model is constructed by the CP Solver using the same condition and constraints. This process generates four models, one from each group.

Chain ID assignment for homo-oligomer targets

Assigning chain IDs can be challenging for homo-oligomer cases, where different chains may have identical sequences. We run the CP solver with a specific objective function designed for optimizing chain assignment. If a Cα model of a homo-multimer has errors in the chain ID assignment, the chain ID assignment is swapped between equivalent local sequence regions of different chains (Extended Data 9a). To start with, we define chain fragments along each chain model by checking the distance between adjacent residues. If adjacent amino acids are more than 10 Å apart at their Cα - Cα positions, we define the former and the latter parts from the gap as two separate chain fragments. Also, if residue numbers are discontinuous in a chain model, i.e. if a local region is missing from the model structure, the two parts are considered to be different chain fragments.
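The definition of chain fragments can be sketched as a single pass over a chain. This is a minimal illustration under the two rules just described (a Cα-Cα gap of more than 10 Å, or a break in residue numbering); the function name and the `(residue_number, (x, y, z))` representation are hypothetical.

```python
import math


def split_chain_fragments(chain, max_ca_distance=10.0):
    """Split a chain into fragments at large Cα gaps or numbering breaks.

    chain: list of (residue_number, (x, y, z)) tuples in sequence order
    """
    fragments, current = [], []
    for resnum, xyz in chain:
        if current:
            prev_num, prev_xyz = current[-1]
            numbering_break = resnum != prev_num + 1
            spatial_gap = math.dist(prev_xyz, xyz) > max_ca_distance
            if numbering_break or spatial_gap:
                fragments.append(current)
                current = []
        current.append((resnum, xyz))
    if current:
        fragments.append(current)
    return fragments


if __name__ == "__main__":
    chain = [(1, (0.0, 0.0, 0.0)), (2, (3.8, 0.0, 0.0)),
             (3, (20.0, 0.0, 0.0)), (5, (23.8, 0.0, 0.0))]
    print([[num for num, _ in frag] for frag in split_chain_fragments(chain)])
```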

The target function to optimize by the CP solver consists of two terms. The first term is to maximize the sum of DAQ scores from each chain fragment to ensure the overall good fit to the EM map. The second term is a penalty term to be minimized to resolve local structure inconsistencies between chains:

\sum_{n = 1}^{N_{fragment}} r_{n} s_{n} - w \sum_{(c_{k}, c_{l})} \sum_{(i, j) \in C} e (|d_{c_{k}} (i, j) - d_{c_{l}} (i, j)|)

(Eq. 10)

where $N_{fragment}$ is the number of chain fragments in the homo-oligomer structure model, $s_{n} (n = 1, \dots, N_{fragment})$ is the DAQ score of the n-th fragment, and $r_{n} \in \{0, 1\}$, $n = 1, \dots, N_{fragment}$, indicates inclusion or exclusion of the n-th fragment in the oligomer model. In the second term, $w$ is a weight, set to $w = 0.01$, to balance the two terms in the target function. The first summation adds up the penalty scores from every pair of chains, $c_{k}$ and $c_{l}$, and the second summation adds up the penalty term $e$ for every pair of amino acid residues, $i$ and $j$, that satisfies the condition $C : \{(i, j) \mid (i, j) \in \mathrm{CommonAA} (c_{k}, c_{l}), f_{k} (i) \neq f_{k} (j), f_{l} (i) \neq f_{l} (j)\}$. $\mathrm{CommonAA} (c_{k}, c_{l})$ is the set of amino acid residue position pairs that exist in both chains $c_{k}$ and $c_{l}$. $f_{k} (i)$ is a function that returns the fragment ID $n$ $(1, \dots, N_{fragment})$ in which residue $i$ of chain $k$ is located. Thus, the penalty term $e (|d_{c_{k}} (i, j) - d_{c_{l}} (i, j)|)$ is considered only when two amino acids $i$ and $j$ are located in different chain fragments in both chains $c_{k}$ and $c_{l}$. In the penalty term $e$, $d_{c_{k}} (i, j)$ is the Euclidean distance between residues $i$ and $j$ in the chain $c_{k}$. $e = 1$ if $|d_{c_{k}} (i, j) - d_{c_{l}} (i, j)| > 2.0$ Å, and 0 otherwise. The target function (Eq. 10) is optimized subject to

\begin{cases} r_{n} \in \{0, 1\} & \text{(i.e. to include or exclude chain fragment } n \text{)}, \\ g (n) \in \{k, l, \dots\} & \text{where } k, l \text{ are chain IDs (i.e. chain ID assignment for chain fragment } n \text{)} \end{cases}

(Eq. 11)

for $n = 1, \dots, N_{fragment}$. $g (n)$ is the function that assigns the chain fragment to a chain (Extended Data 9b).

Model evaluation by DAQ score and DOT score

We used the DAQ score and DOT score to evaluate the generated models. To compute the DOT score, we first computed a simulated density map from the full-atom model using e2pdb2mrc.py in the EMAN2 package at a 5 Å resolution. The grid spacing of the model’s simulated map and the target EM density map is converted to 2.0 Å, and then a unit vector is computed for each grid point $g_{i} (i = 1, \dots, N_{grid})$. We use density threshold values of 10.0 and 0.01 for the simulated map and the target EM map, respectively. In the map, the unit vector $\vec{u_{i}}$ at $g_{i}$, which points toward the local dense point $y_{i}$, is defined as

\vec{u_{i}} = \frac{y_{i} - g_{i}}{|y_{i} - g_{i}|}

(Eq. 12)

where

y_{i} = \frac{\sum_{n = 1}^{N_{grid}} k (g_{i} - g_{n}) θ_{n} g_{n}}{\sum_{m = 1}^{N_{grid}} k (g_{i} - g_{m})}

(Eq. 13)

$θ_{n}$ is the density value at the grid point $g_{n}$, and $k (p)$ is a Gaussian kernel function that is defined as:

k (p) = \exp \left( - 1.5 \left| \frac{p}{τ} \right|^{2} \right)

(Eq. 14)

where $τ$ is a bandwidth, set to 8.0. Then, the DOT score, which is the agreement of the unit vector pairs, $\vec{u_{i}}$ from the map and $\vec{u_{i}^{'}}$ from the simulated map of the protein model, averaged over the grid points, is

\mathrm{DOT} = \frac{\sum_{i = 1}^{N_{grid}} (\vec{u_{i}} \cdot \vec{u_{i}^{'}})}{N_{grid}}

(Eq. 15)

$N_{grid}$ is the number of grid points in the map.
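Given the two sets of unit vectors, the DOT score of Eq. 15 is simply their mean dot product. The following sketch assumes the vectors have already been computed and are supplied as equally long lists of 3-tuples in the same grid order; the function name `dot_score` is illustrative.

```python
def dot_score(map_vectors, model_vectors):
    """Eq. 15: mean dot product of paired unit vectors.

    map_vectors:   unit vectors from the target EM map, one per grid point
    model_vectors: unit vectors from the model's simulated map, same order
    """
    if len(map_vectors) != len(model_vectors) or not map_vectors:
        raise ValueError("vector lists must be non-empty and of equal length")
    total = sum(sum(a * b for a, b in zip(u, v))
                for u, v in zip(map_vectors, model_vectors))
    return total / len(map_vectors)


if __name__ == "__main__":
    u = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    v = [(1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
    print(dot_score(u, u), dot_score(u, v))
```

Since each term is the cosine of the angle between two unit vectors, the score lies between -1 (all vectors opposed) and 1 (all vectors aligned).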

The final score used to rank full-atom models is a simple sum of the normalized DAQ score and DOT score.

Computational Time

The computational time is discussed in Extended Data 10.

Code availability

The source code of DeepMainmast is made available at https://github.com/kiharalab/DeepMainMast. The web server is available at https://em.kiharalab.org/algorithm/DeepMainMast. It can also run on Google Colab notebook webserver without installing it in a local machine at https://github.com/kiharalab/DeepMainMast/blob/main/DeepMainMast.ipynb. Capsules are prepared at CodeOcean at https://codeocean.com/capsule/9358532.

VESPER used in the pipeline is available at https://github.com/kiharalab/VESPER. DAQ is available at https://github.com/kiharalab/DAQ.

Extended Data

Extended Data Fig. 3 — These plots are related to main Fig. 2d. In Fig. 2d, we plotted the amino acid matching accuracy, which is the fraction of correctly modeled amino acids in each target. If an amino acid is not modeled correctly, in our definition, it should be either the Cα position itself is not within 3 Å in the first place or a case that the amino acid type was incorrectly assigned to correctly identified Cα position. Here, we examined these two reasons of amino acid matching errors. Thus, for each target, 1 – amino acid matching accuracy (AA Match) = (Fraction of cases with an incorrect Cα position) + (Fraction of cases of an incorrect AA type assignment). a, the fraction of cases of incorrect AA type was plotted relative to the AA match accuracy. The line shows y = −x + 1.0, the maximum possible value for the incorrect AA type relative to the AA match. b, the y-axis shows (Incorrect AA Type)/(1 – AA match). The average of this value for all the targets is 0.35. For relatively easy targets where AA Match > 0.95, the fraction of Incorrect AA Type was 0.11, indicating that DeepMainmast did not make much incorrect AA type assignments and most of the errors come from incorrect Cα position detection. For targets when AA match < 0.95, the fraction of incorrect AA type was 0.44.

Extended Data Fig. 4 — DeepMainmast(base) produced one Cα model for each target. For the Cα model, we ran Rosetta-CM, which fills missing regions (if any) and relax the structure, which produces 5 models. Out of them, we used the combination of the DOT score and the DAQ score, the same protocol as used in the final model selection step in the DeepMainmast pipeline (Fig. 1), to select the model for comparison. As for full-atom models for MAINMAST, following the MAINMAST protocol, MDFF was used to generate 500 full-atom models from one Cα model, among which the top-scoring full atom model was selected. a, the Cα coverage. b, the amino acid matching accuracy. c, TM-Score. d, the length of aligned regions between the model and the native structure by TM-align. These regions were used to compute RMSD in panel e, Cα RMSD of protein models.

Extended Data Fig. 5 — a and b, We used 33 targets in 178 experimental dataset. Details of computing local resolution is provided in Supplementary Table 10_Local_Resolution. We analyzed the models generated by DeepMainmast(base) and DeepMainmast without applying the last full atom model construction step. a, the accuracy of Cα atom positions of DeepMainmast(base) models (blue) and DeepMainmast models (orange). The bars and values represent the fraction of Cα atoms in models that are correctly positioned within 3 Å. The black line indicates the number of Cα atoms at the local resolution. b, The accuracy of amino acid type assignment by DeepMainmast(base) (blue) and DeepMainmast protocols (orange). The accuracy is defined as the fraction of Cα atoms in a model which are placed within 3 Å to the correct position and have the correct amino acid type assignment. c, the accuracy of Cα positions based on the secondary structure types. Orange and blue dots represent the results of DeepMainmast(base) and DeepMainmast for 33 targets, respectively. The secondary structures were computed by DSSP. The secondary structure types of G, H, and I are assigned as ‘Helix’. The secondary structure types of E and B are assigned as ‘Strand’. The secondary structure types of S, T, and B are assigned as ‘Loop’. The regions within the loop with a relative accessible surface area (ASA) greater than 10% were classified as ‘Flex’ (flexible). The values of minima, maxima, mean, median, bounds of box and whiskers of different categories in order: DeepMainmast(base), Helix (0.07, 1.00, 0.93, 1.00, 0.97/1.00, 0.94/1.00), Strand(0.25, 1.00, 0.90, 1.00, 0.96/1.00, 0.90/1.00), Loop (0.24, 0.98, 0.82, 0.91, 0.83/0.95, 0.76/0.98), and Flex (0.24, 0.98, 0.82, 0.91, 0.82/0.95, 0.71/0.98). 
DeepMainmast, Helix (0.03, 1.00, 0.94, 1.00, 1.00/1.00, 1.00/1.00), Strand (0.27, 1.00, 0.91, 1.00, 1.00/1.00, 1.00/1.00), Loop (0.21, 1.00, 0.86, 0.93, 0.86/0.97, 0.73/1.00), and Flex (0.19, 1.00, 0.91, 0.92, 0.86/0.96, 0.72/1.00). d, the accuracy of amino acid type assignment based on the secondary structure types. The values of minima, maxima, mean, median, bounds of box, and whiskers of different categories in order: DeepMainmast(base), Helix (0.44, 1.00, 0.95, 1.00, 0.94/1.00, 0.86/1.00), Strand (0.40, 1.00, 0.89, 0.99, 0.86/1.00, 0.71/1.00), Loop (0.31, 1.00, 0.89, 0.93, 0.83/0.99, 0.70/1.00), and Flex (0.36, 1.00, 0.89, 0.93, 0.82/1.00, 0.70/1.00). DeepMainmast, Helix (0.17, 1.00, 0.95, 1.00, 1.00/1.00, 1.00/1.00), Strand (0.60, 1.00, 0.96, 1.00, 0.96/1.00, 0.94/1.00), Loop (0.45, 1.00, 0.92, 0.96, 0.89/1.00, 0.77/1.00), and Flex (0.44, 1.00, 0.91, 0.95, 0.88/1.00, 0.77/1.00). c and d, The bold numbers shown are the average values across all the targets. In these box plots, the center line, the bottom, and the top of a box show the median, first quartile, and third quartile values, respectively. The whiskers extend to 1.5 times the distance between the upper and lower quartiles. Details are provided in Supplementary Table 11_SSanalysis.
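
The two accuracy measures and the secondary structure grouping described in this caption can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' evaluation code: the function names are hypothetical, the "correct position" is assumed to be the nearest native Cα atom, and because the caption lists DSSP code B under both ‘Strand’ and ‘Loop’, the sketch assumes the ‘Strand’ listing and treats every remaining code as ‘Loop’.

```python
import math

# DSSP codes grouped as in the caption. 'B' is assigned to Strand here
# (assumption: the caption lists it under both Strand and Loop).
HELIX = {"G", "H", "I"}
STRAND = {"E", "B"}


def ss_class(dssp_code):
    """Map a one-letter DSSP code to 'Helix', 'Strand', or 'Loop'."""
    if dssp_code in HELIX:
        return "Helix"
    if dssp_code in STRAND:
        return "Strand"
    return "Loop"  # S, T, and any other code (assumption)


def is_flex(dssp_code, relative_asa):
    """'Flex': a loop residue with relative ASA greater than 10%."""
    return ss_class(dssp_code) == "Loop" and relative_asa > 0.10


def ca_accuracy(model, native, cutoff=3.0):
    """Return (position accuracy, amino acid accuracy) over model C-alpha atoms.

    model, native: non-empty lists of (amino_acid, (x, y, z)) tuples.
    A model atom counts as placed if its nearest native C-alpha is within
    `cutoff` angstroms, and as typed if that residue also has the same
    amino acid type (hypothetical matching rule for this sketch).
    """
    placed = typed = 0
    for aa, xyz in model:
        nearest_aa, nearest_xyz = min(native, key=lambda r: math.dist(xyz, r[1]))
        if math.dist(xyz, nearest_xyz) <= cutoff:
            placed += 1
            typed += int(aa == nearest_aa)
    return placed / len(model), typed / len(model)
```

Under these assumptions, a three-residue model with two atoms near their native positions, only one of which carries the right amino acid type, gives a position accuracy of 2/3 and an amino acid accuracy of 1/3.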

Extended Data Fig. 6 — For the three target proteins shown in Fig. 4d, e, and f, all models generated at the three major steps were evaluated in terms of TM-score. a. Models for PDB 3J9S chain A (Fig. 4d). b. Models for PDB 3J9C chain A (Fig. 4e). c. Models for PDB 5V6P chain A (Fig. 4f). In each panel, blue, orange, and green box plots show the TM-Score distribution of the models generated by the ‘Assembling Cα Fragments’, ‘Combining Models’, and ‘Building Full-Atom Models & Refinement’ steps in Fig. 1, respectively. In these three steps, 54, 4, and 20 models were generated, respectively. Red circles represent the models generated by the DeepMainmast(base) protocol. Black circles represent the models additionally generated by DeepMainmast. In the box plots, the middle line in a box corresponds to the median, and the top and bottom ends of a box represent quartiles. The upper and lower whiskers represent 1.5 * the interquartile range. Black diamonds represent outliers beyond the whiskers. Details are provided in Supplementary Table 12_ModelingStep.
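
The box plot convention used in this caption (quartile box, whiskers at 1.5 * the interquartile range, outliers beyond them) can be sketched as follows. This is an assumed reading of the caption, not the plotting code used for the figure: `box_stats` is a hypothetical helper, whiskers are drawn to the most extreme data point inside the 1.5 * IQR fences, and quartiles use the "inclusive" interpolation of Python's `statistics.quantiles`, which may differ slightly from other plotting tools.

```python
import statistics


def box_stats(values, whisker=1.5):
    """Summarize values the way a Tukey-style box plot draws them."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers stop at the most extreme points that lie inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Example with made-up TM-scores for seven models of one target.
stats = box_stats([0.10, 0.50, 0.52, 0.55, 0.60, 0.62, 0.65])
print(stats["median"], stats["outliers"])
```

In this made-up example the model with TM-score 0.10 falls below the lower fence and would be drawn as a diamond, while the whiskers span 0.50 to 0.65.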

Extended Data Fig. 7 — a, the Cα coverage. b, the amino acid matching accuracy. c, TM-Score. d, the sequence identity at aligned positions. TM-Score and the sequence identities were computed by MMalign.

Extended Data Fig. 8 — The network architecture of Emap2sf (Emap to structural features), which is used to detect amino acid types and atom types at each grid point in an input EM density map. a. the network architecture. The entire network is a 3D U-shape-based convolutional network (UNet) with full-scale skip connections and deep supervision. The numbers indicate the channel size of the corresponding layers. N is 20 for the amino acid type detection UNet and 6 for the atom type detection UNet. b. the encoder block (Enc in panel a) and the decoder block (Dec in panel a). Conv3D, a 3-dimensional (3D) convolutional layer with a filter size of 3*3*3, stride 1, and padding 1. BatchNorm, a normalization layer that uses batch statistics to normalize the input data. ReLU, Rectified Linear Unit, a commonly used activation layer.
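
One consequence of the Conv3D settings in this caption is that each such layer preserves the grid size of its input. The short sketch below applies the standard convolution output-size formula (no dilation) to check this; `conv_output_size` is a hypothetical helper written for illustration, and any change of resolution inside the UNet would come from layers not described in this caption.

```python
def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Output length along one axis of a convolution without dilation."""
    return (size + 2 * padding - kernel) // stride + 1


# With kernel 3, stride 1, and padding 1, every axis keeps its length,
# so a cubic input patch keeps its shape through the Conv3D layer.
patch = (32, 32, 32)  # example patch size chosen for illustration only
print(tuple(conv_output_size(n) for n in patch))
```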

Extended Data Fig. 9 — a. Example of chain ID assignment for EMD-5925. The deposited model (PDB ID 3J6J) is a homo-octamer. All models are colored by chain ID. The magnified images highlight a region where different chains interact. The left column shows the deposited model (PDB 3J6J) of EMD-5925. The middle column shows the model generated by the DeepMainmast(base) protocol prior to the chain ID assignment step. The right column shows the DeepMainmast(base) model after the chain ID assignment step is completed. As shown, chains are correctly connected and identified. b. Illustration of the chain ID assignment for a homo-dimer target. In this example, two five-residue-long models with different chain IDs (green: chain A, and blue: chain B) are shown. Numbered circles represent amino acid residues, and the number is the residue number in the sequence. For the chain ID assignment, DeepMainmast maximizes the objective function (Eq. 11), which consists of a DAQ score term and a penalty term. The penalty term is intended to favor similar structures in different chains of homo-oligomers. In the left and right columns, we illustrate the computation of the penalty term e() before and after the chain ID assignment, respectively. Solid arrows between two residues indicate the distance between residues i and j in the chain A and chain B models (d_A(i,j) and d_B(i,j)). If |d_A(i,j) - d_B(i,j)| > 3.0 Å, e() = 1. Since the models of chains A and B have different structures, all penalties e() are 1 before the chain ID assignment. After the chain ID assignment, all penalties e() are reduced to zero.
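
The penalty indicator e() described in panel b can be illustrated with a small Python sketch. It covers only the indicator, not the full objective function of Eq. 11 or the optimization itself; `penalty_terms` is a hypothetical name, chains are represented as dictionaries from residue number to Cα coordinate, and the sketch assumes that every residue pair present in both chains is compared.

```python
import math
from itertools import combinations


def penalty_terms(chain_a, chain_b, tolerance=3.0):
    """Return {(i, j): e} for residue pairs i < j found in both chains.

    chain_a, chain_b: dict mapping residue number -> (x, y, z) of the C-alpha.
    e is 1 when the intra-chain distances d_A(i, j) and d_B(i, j) differ by
    more than `tolerance` angstroms, and 0 otherwise.
    """
    shared = sorted(set(chain_a) & set(chain_b))
    terms = {}
    for i, j in combinations(shared, 2):
        d_a = math.dist(chain_a[i], chain_a[j])
        d_b = math.dist(chain_b[i], chain_b[j])
        terms[(i, j)] = 1 if abs(d_a - d_b) > tolerance else 0
    return terms


# Two identical three-residue chains, one translated: no penalties.
chain_a = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0)}
chain_b = {1: (0.0, 10.0, 0.0), 2: (3.8, 10.0, 0.0), 3: (7.6, 10.0, 0.0)}
print(sum(penalty_terms(chain_a, chain_b).values()))
```

Because only intra-chain distances are compared, the indicator is unaffected by where each chain copy sits in the map; it is nonzero only when the two copies have different internal geometry, which is the situation the caption describes before chain IDs are reassigned.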

Extended Data Fig. 10 — a. Computational time of the DeepMainmast protocol on the dataset of 178 single-chain targets. b. Computational time of the DeepMainmast protocol on the dataset of 20 multi-chain targets. For this experiment, we used one GPU card (Nvidia GeForce GTX 1080Ti, 12GB memory) and four threads on one CPU (Intel Xeon CPU E5–1650 v4). The plots show the computational time in three colors (blue, orange, and green), corresponding to (1) the steps up to ‘Assembling Cα Fragments’, (2) the steps up to ‘Combining Models’, and (3) the steps up to ‘Building Full-Atom Models & Refinement’ in Fig. 1, respectively. The green solid lines represent the regression lines of the total computational time. The GPU handles the deep learning process, while the CPU performs the other steps. As shown in the figure, the combining models step, which uses CPUs, takes most of the time. The time is generally proportional to the length of the protein but depends on the target protein structure. It is strongly influenced by the difficulty of modeling; that is, the CP Solver needs more time for maps that require exploring a larger number of fragment combinations. On average, a single protein of up to ~500 residues can be modeled within a few hours. To speed up the process, we provide a multi-thread version of the code at the GitHub repository, which can use multiple CPUs simultaneously. We also provide a fast version, which uses only a limited number of parameter combinations and does not perform full-atom building and structure refinement. Details are provided in Supplementary Tables 13_CompTime_single and 14_CompTime_multi.

Supplementary Material

Supplementary File

NIHMS2126338-supplement-Supplementary_File.docx (4.9MB, docx)

Supplementary tables

NIHMS2126338-supplement-Supplementary_tables.xlsx (85.1KB, xlsx)

Acknowledgments

This work was partly supported by the National Institutes of Health (R01GM133840 and 3R01GM133840–02S1) and the National Science Foundation (CMMI1825941, MCB1925643, IIS2211598, DMS2151678, DBI2146026, and DBI2003635).

Footnotes

Competing interests

The authors declare no competing financial interest.

Data availability

Source data are available in the Supplementary Tables. The lists of PDB and EMDB entries used in the benchmark datasets are available in Supplementary Tables 1_Dataset, 4_Single_Model_Acc, 5_178targets, and 6_MultiChain_results. The lists of the training and testing sets are available in Supplementary Tables 8_MultiChain_results and 9_Test_set.

References

1. Terashi G, Wang X, Maddhuri Venkata Subramaniya SR, Tesmer JJG & Kihara D Residue-wise local quality estimation for protein models from cryo-EM maps. Nature methods 19, 1116–1125, doi: 10.1038/s41592-022-01574-4 (2022).
2. Nakamura T, Wang X, Terashi G & Kihara D DAQ-Score Database: assessment of map-model compatibility for protein structure models from cryo-EM maps. Nature methods 20, 775–776, doi: 10.1038/s41592-023-01876-1 (2023).
3. Alnabati E & Kihara D Advances in Structure Modeling Methods for Cryo-Electron Microscopy Maps. Molecules 25, doi: 10.3390/molecules25010082 (2019).
4. Hryc CF & Baker ML Beyond the Backbone: The Next Generation of Pathwalking Utilities for Model Building in CryoEM Density Maps. Biomolecules 12, doi: 10.3390/biom12060773 (2022).
5. Terashi G & Kihara D De novo main-chain modeling for EM maps using MAINMAST. Nature communications 9, 1618, doi: 10.1038/s41467-018-04053-7 (2018).
6. Terashi G, Kagaya Y & Kihara D MAINMASTseg: Automated Map Segmentation Method for Cryo-EM Density Maps with Symmetry. Journal of chemical information and modeling 60, 2634–2643, doi: 10.1021/acs.jcim.9b01110 (2020).
7. Song Y et al. High-resolution comparative modeling with RosettaCM. Structure 21, 1735–1742, doi: 10.1016/j.str.2013.08.005 (2013).
8. Wang RY et al. Automated structure refinement of macromolecular assemblies from cryo-EM maps using Rosetta. Elife 5, doi: 10.7554/eLife.17219 (2016).
9. Zhang X, Zhang B, Freddolino PL & Zhang Y CR-I-TASSER: assemble protein structures from cryo-EM density maps using deep convolutional neural networks. Nat Methods 19, 195–204, doi: 10.1038/s41592-021-01389-9 (2022).
10. Pfab J, Phan NM & Si D DeepTracer for fast de novo cryo-EM protein structure modeling and special studies on CoV-related complexes. Proceedings of the National Academy of Sciences of the United States of America 118, doi: 10.1073/pnas.2017525118 (2021).
11. He J & Huang SY Full-length de novo protein structure determination from cryo-EM maps using deep learning. Bioinformatics, doi: 10.1093/bioinformatics/btab357 (2021).
12. Terwilliger TC et al. Iterative model building, structure refinement and density modification with the PHENIX AutoBuild wizard. Acta crystallographica. Section D, Biological crystallography 64, 61–69, doi: 10.1107/S090744490705024X (2008).
13. Emsley P, Lohkamp B, Scott WG & Cowtan K Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66, 486–501, doi: 10.1107/S0907444910007493 (2010).
14. Jumper J et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589, doi: 10.1038/s41586-021-03819-2 (2021).
15. Kryshtafovych A et al. Computational models in the service of X-ray and cryo-electron microscopy structure determination. Proteins 89, 1633–1646, doi: 10.1002/prot.26223 (2021).
16. McCafferty CL, Pennington EL, Papoulas O, Taylor DW & Marcotte EM Does AlphaFold2 model proteins’ intracellular conformations? An experimental test using cross-linking mass spectrometry of endogenous ciliary proteins. Commun Biol 6, 421, doi: 10.1038/s42003-023-04773-7 (2023).
17. Hryc CF & Baker ML AlphaFold2 and CryoEM: Revisiting CryoEM modeling in near-atomic resolution density maps. iScience 25, 104496, doi: 10.1016/j.isci.2022.104496 (2022).
18. Terwilliger TC et al. AlphaFold predictions are valuable hypotheses, and accelerate but do not replace experimental structure determination. bioRxiv, 2022.2011. 2021.517405 (2022).
19. Dantzig GB & Ramser JH The truck dispatching problem. Management science 6, 80–91 (1959).
20. Perron L in International Conference on Principles and Practice of Constraint Programming. 2–2 (Springer; ).
21. Han X, Terashi G, Christoffer C, Chen S & Kihara D VESPER: global and local cryo-EM map alignment using local density vectors. Nature communications 12, 2090, doi: 10.1038/s41467-021-22401-y (2021).
22. Huang H et al. in ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 1055–1059 (IEEE; ).
23. Carreira-Perpinan MA in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06). 1160–1167 (IEEE; ).
24. Rotkiewicz P & Skolnick J Fast procedure for reconstruction of full-atom protein models from reduced representations. Journal of computational chemistry 29, 1460–1465, doi: 10.1002/jcc.20906 (2008).
25. Pettersen EF et al. UCSF Chimera--a visualization system for exploratory research and analysis. Journal of computational chemistry 25, 1605–1612, doi: 10.1002/jcc.20084 (2004).
26. Hoh SW, Burnley T & Cowtan K Current approaches for automated model building into cryo-EM maps using Buccaneer with CCP-EM. Acta Crystallogr D Struct Biol 76, 531–541, doi: 10.1107/S2059798320005513 (2020).
27. Terwilliger TC, Adams PD, Afonine PV & Sobolev OV A fully automatic method yielding initial models from high-resolution cryo-electron microscopy maps. Nature methods 15, 905–908, doi: 10.1038/s41592-018-0173-1 (2018).
28. Shekhar M et al. CryoFold: determining protein structures and data-guided ensembles from cryo-EM density maps. Matter 4, 3195–3216, doi: 10.1016/j.matt.2021.09.004 (2021).
29. Singharoy A et al. Molecular dynamics-based refinement and validation for sub-5 A cryo-electron microscopy maps. Elife 5, doi: 10.7554/eLife.16105 (2016).
30. Perez A, MacCallum JL & Dill KA Accelerating molecular simulations of proteins using Bayesian inference on weak information. Proceedings of the National Academy of Sciences of the United States of America 112, 11846–11851, doi: 10.1073/pnas.1515561112 (2015).
31. Allegretti M, Mills DJ, McMullan G, Kuhlbrandt W & Vonck J Atomic model of the F420-reducing [NiFe] hydrogenase by electron cryo-microscopy using a direct electron detector. eLife 3, e01963, doi: 10.7554/eLife.01963 (2014).
32. Bartesaghi A, Matthies D, Banerjee S, Merk A & Subramaniam S Structure of beta-galactosidase at 3.2-A resolution obtained by cryo-electron microscopy. Proceedings of the National Academy of Sciences of the United States of America 111, 11709–11714, doi: 10.1073/pnas.1402809111 (2014).
33. Hattne J et al. Analysis of Global and Site-Specific Radiation Damage in Cryo-EM. Structure 26, 759–766 e754, doi: 10.1016/j.str.2018.03.021 (2018).
34. Zhang Y & Skolnick J TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res 33, 2302–2309, doi: 10.1093/nar/gki524 (2005).
35. Wang X, Terashi G & Kihara D De novo structure modeling for nucleic acids in cryo-EM maps using deep learning. Nature Methods (2023).
36. Mukherjee S & Zhang Y MM-align: a quick algorithm for aligning multiple-chain protein complex structures using iterative dynamic programming. Nucleic Acids Res 37, e83, doi: 10.1093/nar/gkp318 (2009).
37. Lawson CL et al. EMDataBank unified data resource for 3DEM. Nucleic acids research 44, D396–403, doi: 10.1093/nar/gkv1126 (2016).

Methods Only References

38. Stoyanov D et al. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 20, 2018, Proceedings. Vol. 11045 (Springer, 2018).
39. Kingma DP & Ba J Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
40. Siew N, Elofsson A, Rychlewski L & Fischer D MaxSub: an automated measure for the assessment of protein structure prediction quality. Bioinformatics 16, 776–785, doi: 10.1093/bioinformatics/16.9.776 (2000).
41. Tang G et al. EMAN2: an extensible image processing suite for electron microscopy. Journal of structural biology 157, 38–46, doi: 10.1016/j.jsb.2006.05.009 (2007).

