PLOS Computational Biology. 2023 Jul 31;19(7):e1011255. doi: 10.1371/journal.pcbi.1011255

Gentle and fast all-atom model refinement to cryo-EM densities via a maximum likelihood approach

Christian Blau 1, Linnea Yvonnesdotter 2, Erik Lindahl 1,2,*
Editor: Jochen Hub 3
PMCID: PMC10427019  PMID: 37523411

Abstract

Better detectors and automated data collection have generated a flood of high-resolution cryo-EM maps, which in turn has renewed interest in improving methods for determining structure models corresponding to these maps. However, automatically fitting atoms to densities becomes difficult as their resolution increases and the refinement potential has a vast number of local minima. In practice, the problem becomes even more complex when one also wants to achieve a balance between a good fit of atom positions to the map, while also establishing good stereochemistry or allowing protein secondary structure to change during fitting. Here, we present a solution to this challenge using a maximum likelihood approach by formulating the problem as identifying the structure most likely to have produced the observed density map. This allows us to derive new types of smooth refinement potential—based on relative entropy—in combination with a novel adaptive force scaling algorithm to allow balancing of force-field and density-based potentials. In a low-noise scenario, as expected from modern cryo-EM data, the relative-entropy based refinement potential outperforms alternatives, and the adaptive force scaling appears to aid all existing refinement potentials. The method is available as a component in the GROMACS molecular simulation toolkit.

Author summary

Cryo-electron microscopy has gone through a revolution and now regularly produces data with 2 Å resolution. However, this data comes in the shape of density maps, and fitting atomic coordinates into these maps can be a labor-intensive and challenging problem. This is particularly true when there are multiple conformations, flexible regions, or parts of the structure with lower resolution. In many cases it is also desirable to understand how a molecule moves between such conformations. This can be addressed with molecular dynamics simulations using densities as target restraints, but the refinement potentials commonly used can distort protein structure or get stuck in local minima when the cryo-EM map has high resolution. This work derives new refinement potentials based on models of the cryo-EM scattering process that provide a gentle way to fit protein structures to densities in simulations, and we also suggest an automated heuristic way to balance the influence of the map and simulation force field.

1 Introduction

Cryo-electron microscopy (cryo-EM) has undergone a revolution in the last few years due to better detectors, measurement techniques and algorithms [1], and the technique now allows for rapid reconstruction of biomolecular “density maps” at near-atomic resolution [2, 3]. These density maps describe interactions between the sample and an electron beam in real space. They (including similar ones derived from X-ray structure factors) provide the basis for reasoning in structural biology. In particular, for cryo-EM, Bayesian statistics has revolutionized the reconstruction of the density maps from micrographs. This provides a framework to soundly combine prior assumptions about the three-dimensional density map model with the likelihood function that connects this model to the measured data, and to determine the density map most likely to have generated the observed data instead of directly trying to solve the underdetermined inverse problem [4].

However, to understand the structure and function of biological macromolecules, merely having an overall cryo-EM density is typically not sufficient; it is also necessary to model coordinates of individual atoms into the maps [5]. This enables understanding of e.g. binding site properties, interactions with lipids or other subunits, structural rearrangements between alternative conformations, and in particular it makes it possible to model structural dynamics on nanosecond to microsecond time scales via molecular dynamics (MD) simulation [6]. If the interaction descriptions (force fields) used in these simulations were perfect and one had access to infinite amounts of sampling, computational methods should be able to further improve the structure just by starting refinement from a rough initial density, but in practice both force fields and sampling have shortcomings. Nevertheless, it remains an attractive idea to combine the best of both worlds by using cryo-EM data as large-scale constraints while force fields are employed to fine-tune details, in particular local stereochemistry or interactions on resolution scales that go beyond what the cryo-EM data can resolve. Cryo-EM data and stereochemical constraints have been combined favorably in the past to aid structure modeling into three-dimensional cryo-EM densities either by adding force field terms enforcing desired stereochemistry to established modeling tools [7–11] or by adding a heuristic density-based biasing potential to molecular dynamics simulations [12, 13] or elastic network models [14].

In practice, it is not straightforward how to best combine experimental map data with simulations and achieve both satisfactory results and rapid convergence. Density-based biasing potentials can in principle achieve arbitrarily good fits to a map, but it comes at a cost of distorting the protein structure. To address the challenges of balancing desired stereochemical properties with cryo-EM data, refinement protocols have been expanded to include secondary structure restraints [12], multiple resolution ranges [11, 15], as well as multiple force constants [11, 16]; the latter two either consecutively in individual simulations [11] or via Hamiltonian replica exchange [15, 16]. A common challenge of all these approaches is the increase in the ruggedness of the applied bias potential function as the resolution of the cryo-EM density maps increases, and how to correctly balance molecular mechanics and forces from the biasing potential. This leads to an apparent modeling paradox that further improving structural models for cryo-EM densities with molecular dynamics appears to be harder the more high-resolution data is available for these models.

To attack this challenge from a fundamental standpoint, a Bayesian approach has been used to derive probabilities for all-atom structural models given a cryo-EM density [17] and to weigh cryo-EM data influence against other sources of data [18]. These modeling approaches offer valuable insight into the data content in cryo-EM maps and provide promising new ways to model cryo-EM densities by treating them as generic experimental data. However, they do not reflect the underlying physics of data acquisition and density reconstruction from micrographs and have previously not yielded refinement potentials of a new quality to be applicable, e.g., in molecular dynamics simulations.

One way of circumventing the number of model assumptions that are necessary to reflect the reconstruction of three-dimensional cryo-EM densities is to employ Bayesian models that directly connect micrographs and all-atom ensembles [19]. These attempts have previously proven to be prohibitively costly as a way to derive driving forces for molecular dynamic simulation because projections of model atom coordinates onto millions of cryo-EM particle images (i.e., images of molecules) are required for a single force evaluation.

In this work, we show how it is possible to borrow the highly successful approach to density reconstruction and use maximum likelihood modeling of cryo-EM density maps from structures to derive a new biasing potential that is smooth, long-ranged, and provides fewer barriers to refinement than established potentials based on cross-correlation [11, 13] or inner product (equivalent to using potentials proportional to inverted cryo-EM density [12]). This provides a number of advantages, including an ability to overcome density barriers and in particular avoid excessive biasing forces resulting from large local gradients in cryo-EM density maps. It also avoids the need for constraints e.g. on secondary structure and rather allows the simulation to freely explore local conformational space, while the experimental data is used to bias sampling to experimentally favored regions of the global conformational space.

We further demonstrate how better balancing between the force field and cryo-EM density components can be achieved by adaptive force scaling derived from thermodynamic principles. This allows refinement with a single fixed parameter at low computational cost for a range of system sizes and initial model qualities. Additionally, to evaluate biasing potentials based on model to cryo-EM density comparison, we suggest a transformation of all-atom structure to model density that reflects cryo-EM specific characteristics while minimizing computational effort.

We investigate advantages and drawbacks of the newly derived potential in practical applications when compared to established inner-product and cross-correlation biasing potentials on noise-free and experimental cryo-EM data. Finally, we show how the proposed refinement methods rectify a distorted initial model with cryo-EM data. A full open-source implementation is freely available as part of the GROMACS molecular dynamics simulation code [20].

2 Results

2.1 Deriving refinement forces

Our algorithm to refine all-atom models into a cryo-EM density map ρ with molecular dynamics is based on initially roughly aligning density map and target structure, generating a model density from coordinates x, and then determining a fitting potential Ufit based on a comparison of the generated model density and the target cryo-EM density (Fig 1). This potential is used to derive fitting forces, which are then combined with the force field potential Uff based on a heuristic balance between the density-derived forces and force field determined by a force constant k.

Fig 1. Atomic structure models are refined into a cryo-EM density using biasing forces that maximize similarity between model and map.

A refinement/simulation is initialized with an atomic model (orange) and a density map (blue). A model density is generated in each voxel (grey boxes). Voxel-wise similarity scores between model density and cryo-EM density are akin to a noise model (light blue curve). The gradient of the similarity score determines the fitting forces (blue arrows). Together with a molecular dynamics force field (red arrows), the fitting forces enable model coordinate updates (dark orange) that make the model more similar to the density under force field constraints. New model densities are generated iteratively from the updated model in each time step of the simulation until acceptable convergence is reached.

The combined driving forces are determined by the total potential energy,

$$U_{\mathrm{tot}}(x,\rho) = U_{\mathrm{ff}}(x) + U_{\mathrm{fit}}(x,\rho). \tag{1}$$

Assuming that a single configuration of atoms gives rise to the observed cryo-EM density, Bayes’ theorem quantifies the probability density that the given model describes the cryo-EM data as [17]

$$p(x\,|\,\rho) \propto p(x)\,p(\rho\,|\,x). \tag{2}$$

Boltzmann inversion at temperature T connects the left-hand sides of Eqs (1) and (2) [21] where c is an arbitrary potential energy offset

$$\log p(x) = -\frac{1}{k_B T}\,U_{\mathrm{ff}}(x) + c \tag{3}$$
$$\log p(\rho\,|\,x) = -\frac{1}{k_B T}\,U_{\mathrm{fit}}(\rho,x) + c. \tag{4}$$

In this formalism, the force field provides the prior p(x) that would have determined the model without any additional observations, while the fitting potential provides the conditional probability that a particular given structure yields a target density p(ρ|x).

Splitting the fitting potential into a coordinate- and density-independent force constant k and a similarity score $S(\rho,x) := -U_{\mathrm{fit}}/k$ reveals the similarity to previous approaches [8, 12]. We find that applied density forces imply a similarity score between the model structure and target density, and vice versa. With this ansatz, Eq (4) relates similarity measures like cross-correlation or inner-product to their implicit assumptions about the likelihood function above, and in turn enables the construction of new similarity scores that drive refinement procedures depending on the assumptions about the underlying measurement process.
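The split into a force constant and a similarity score can be sketched numerically. The example below is only an illustration under stated assumptions: a single one-dimensional coordinate, a hypothetical harmonic force-field term `u_ff`, and a hypothetical similarity score `similarity`; none of these functions come from the paper. It shows that with a fitting potential of the form -k·S, increasing k shifts the balance of the total force from the force field towards the gradient of the similarity score.

```python
def u_ff(x, x0=0.0, stiffness=1.0):
    """Hypothetical harmonic force-field term (illustration only)."""
    return 0.5 * stiffness * (x - x0) ** 2


def similarity(x, target=2.0):
    """Hypothetical similarity score, largest when x sits on the target."""
    return -(x - target) ** 2


def u_tot(x, k):
    """Total potential U_tot = U_ff + U_fit with U_fit = -k * S."""
    return u_ff(x) - k * similarity(x)


def force(x, k, h=1e-6):
    """Force -dU_tot/dx from a central finite difference."""
    return -(u_tot(x + h, k) - u_tot(x - h, k)) / (2.0 * h)


if __name__ == "__main__":
    # At x = 1 the force field pulls towards 0, the fitting term towards 2.
    for k in (0.0, 0.5, 5.0):
        print(f"k = {k:4.1f}  force at x = 1: {force(1.0, k):+.3f}")
```

In this toy setting the two contributions cancel exactly at k = 0.5, which is the kind of balance the force constant controls.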

2.2 Maximum likelihood model yields negative relative entropy as similarity score

To derive a new refinement potential from the likelihood of measuring a density given the structure, we assume that cryo-EM densities are linked to atom-electron scattering probabilities, where electron-atom interaction leads to a phase shift in electrons. With this assumption, two steps are necessary to calculate the density likelihood p(ρ|x) from coordinates. First, an electron-scattering probability density $\rho^s$ is created from a given structure. Second, this density is compared to a given measured density. Two further assumptions enable the derivation of new similarity scores.

For the per-voxel scattering probability, we present two different sets of assumptions, each leading to a refinement potential in its own right. Many more assumptions may be laid out; here we choose to present those that go beyond previous modeling, yet are tractable in model complexity and can be integrated into a molecular dynamics framework where forces have to be calculated in a numerically stable and fast way. Therefore, we choose not to integrate the full image formation from the microscope detector to three-dimensional density but rather start the modeling process with a three-dimensional density and some additional assumptions on what each voxel represents. By presenting two different approaches to what voxel values in a density represent, it should become even clearer how to adapt our modeling to more complex models.

In the first model we assume that an incident electron will scatter at exactly one voxel and that the probability at which voxel to scatter is proportional to the density values. This requires $\sum_v \rho_v = 1$, which we achieve via rescaling the density values. The free rescaling of cryo-EM density values is motivated from the normalization to unity variance around the particle region during image processing [22]. Introducing the expected number of density interactions, s, the scattering process is described via a Dirichlet distribution,

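$$p_I(\rho\,|\,x,s) = \frac{\Gamma\!\left(\sum_v s\rho^s_v\right)}{\prod_v \Gamma\!\left(s\rho^s_v\right)} \prod_v \rho_v^{\,s\rho^s_v - 1}. \tag{5}$$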

With some approximations (S1 Appendix), we find

$$\log p_I(\rho\,|\,x,s) = s \sum_{v \in \mathrm{voxels}} \rho^s_v \log \rho_v. \tag{6}$$

This newfound potential relates to the traditionally used refinement potentials by defining the similarity score

$$S_I(\rho,\rho^s) := \sum_{v \in \mathrm{voxels}} \rho^s_v \log \rho_v. \tag{7}$$

Using this definition we observe that the scaling parameter s and force constant are related via k = kBTs.
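To make the Dirichlet picture concrete, the following sketch evaluates the exact logarithm of Eq (5) with the standard library and separates out the term that scales with s, which has the form of the right-hand side of Eq (6). The function names are illustrative and not taken from the paper or from GROMACS. The approximations the authors apply are given in S1 Appendix and are not reproduced here; the code only shows which terms exist before any approximation.

```python
import math


def dirichlet_log_pdf(rho, rho_s, s):
    """Exact log of Eq (5) for a normalized density rho and model density rho_s."""
    alpha = [s * m for m in rho_s]  # concentration parameters s * rho^s_v
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(r) for a, r in zip(alpha, rho))


def data_term(rho, rho_s, s):
    """Term that couples model and map and scales with s: s * sum_v rho^s_v log rho_v."""
    return s * sum(m * math.log(r) for m, r in zip(rho_s, rho))


if __name__ == "__main__":
    rho = [0.2, 0.3, 0.5]    # normalized target density (toy values)
    rho_s = [0.25, 0.25, 0.5]  # normalized model density (toy values)
    for s in (10.0, 100.0, 1000.0):
        print(s, dirichlet_log_pdf(rho, rho_s, s), data_term(rho, rho_s, s))
```

A quick sanity check is the flat case: with three voxels and all concentration parameters equal to one, the Dirichlet density is the constant 2 on the simplex, so the log-density is log 2.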

Even though s can be estimated from a Bayes’ approach with a conditional posterior estimate on s with a prior p(s), to obtain the likelihood to observe a density, given coordinates x (see S1 Appendix), we choose a different approach for the following reason. Contributions to this conditional posterior are exponentially weighted with SI, so that a good estimate for the parameter depends on the ability to create a sufficiently representative distribution of x similar to the structure. In most cases, however, we expect that we need density-guided simulations to force molecules away from initial configurations over significant energy barriers to be able to sample the relevant configurations that contribute to the estimate of s. To overcome these issues, we heuristically scale s with a protocol described below. This allows us to generate a trajectory with structures with ever-increasing likelihood of good structure-to-map overlap, but admittedly at the cost of not sampling from a proper posterior distribution.

In this Dirichlet distribution-based picture the reported density is treated as a probability density, requiring the removal of negative values and normalization to unity. The resulting potential of this modeling approach is proportional to the Kullback-Leibler divergence between simulated and experimental density with a free scaling parameter. This potential in turn can be seen as an inner-product based potential where density is replaced by its logarithm.

In an alternative picture, reported cryo-EM densities at each voxel are proportional to interaction counts v of N incident electrons, where the scaling factor r is undetermined as mentioned above. We assume that measured scattering probabilities per voxel are independent of other voxel values. This does not exclude spatial correlation between density data but states that the scattering process in one voxel does not influence the electron interaction in other voxels. With this result, it suffices to define a probability distribution at each voxel.

This picture assumes that vitreous ice does not contribute to the scattering, which is commonly achieved by shifting the offset of cryo-EM densities so that water density is represented with voxel values that fluctuate around zero. Only accounting for positive density, we describe this scattering interaction process by a Poisson distribution with parameter $\lambda = N\rho^s_v$. While it is theoretically possible to expand the model to include noise fluctuations and negative densities, we omit this for the sake of reducing model complexity.

With these assumptions (detailed algebraic transformations in S1 Appendix),

$$\log p_{II}(\rho\,|\,x) = \sum_{v \in \mathrm{voxels}} r\rho_v \log\left(\frac{\rho^s_v}{r\rho_v}\right). \tag{8}$$
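One way to see where a logarithm of a ratio of this kind can come from is Stirling's approximation of the Poisson log-probability. The sketch below is not the authors' derivation (that is in S1 Appendix); it only checks numerically that, for a count k and rate λ, the exact Poisson log-probability is close to k·log(λ/k) + k − λ, with a slowly varying correction of about −½·log(2πk). Identifying k with the scaled voxel value and λ with the model rate is an assumption made here for illustration, and the function names are hypothetical.

```python
import math


def poisson_log_pmf(count, lam):
    """Exact Poisson log-probability of `count` events at rate `lam`."""
    return count * math.log(lam) - lam - math.lgamma(count + 1.0)


def stirling_form(count, lam):
    """Leading-order Stirling form: k * log(lam / k) + k - lam."""
    return count * math.log(lam / count) + count - lam


if __name__ == "__main__":
    for count, lam in ((10.0, 12.0), (100.0, 90.0), (1000.0, 900.0)):
        exact = poisson_log_pmf(count, lam)
        approx = stirling_form(count, lam)
        correction = -0.5 * math.log(2.0 * math.pi * count)
        print(count, lam, exact, approx, exact - approx, correction)
```

The printed gap between the exact and approximate values depends only on the count, not on the model rate, so it does not contribute to forces on the model coordinates.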

In analogy to above, the unknown scaling factor r would be treated as a scale parameter and may be estimated from a conditional posterior distribution, but is left as a free parameter to be heuristically scaled here. We obtain a similarity score between simulated model density and cryo-EM density proportional to the negative relative entropy, or Kullback-Leibler divergence, after normalizing ρ such that $\sum_v r_1\rho_v = 1$, with $r = r_1 r_2$,

$$S_{II}(\rho,\rho^s) := \sum_{v \in \mathrm{voxels}} r_1\rho_v \log\left(\frac{\rho^s_v}{r_1\rho_v}\right). \tag{9}$$

Similarly to above, we find that with this similarity score definition, force constant and r2 are related by k = r2kBT (see S1 Appendix).

The newly derived relative-entropy-based similarity score has a domain of [−∞, 0] with perfect agreement at zero. Due to the $\log\rho^s_v$ term, it differs prominently from established similarity scores like cross-correlation [11] and inner-product (formulated as a force following the gradient of a smoothed inverted density which is equivalent in this approach [12]; see S1 Appendix). In contrast, the relative-entropy based score receives the largest contribution from voxels where cryo-EM data has no corresponding model density data.

This leads to a different behavior from established similarity scores with local minima for locally good agreement with cryo-EM data while the relative-entropy based potential will only have minima where there is good global agreement between structure and density. As a consequence, the relative-entropy based density potential is expected to perform better in situations where other potentials cannot escape local minima, at the cost of higher sensitivity to noise in the data, and especially additional density data that is not accounted for in the atomic model.
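The properties described above (zero at perfect agreement, negative otherwise, and dominated by voxels with map density but no model density) can be checked with a small sketch. It assumes that the score is the negative Kullback-Leibler divergence between the normalized map and the normalized model density, and that negative voxel values are clamped to zero before normalization; the function names are illustrative and this is not the GROMACS implementation.

```python
import math


def normalize(density):
    """Clamp negative voxel values to zero and rescale so the values sum to one."""
    clipped = [max(v, 0.0) for v in density]
    total = sum(clipped)
    if total <= 0.0:
        raise ValueError("density has no positive voxels")
    return [v / total for v in clipped]


def relative_entropy_similarity(reference, model):
    """Negative KL divergence: sum_v rho_v * log(rho^s_v / rho_v), always <= 0."""
    rho = normalize(reference)
    rho_s = normalize(model)
    score = 0.0
    for r, m in zip(rho, rho_s):
        if r == 0.0:
            continue  # voxels without map density do not contribute
        if m == 0.0:
            return float("-inf")  # map density that the model cannot explain
        score += r * math.log(m / r)
    return score


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0]
    print(relative_entropy_similarity(reference, [1.0, 2.0, 3.0]))  # perfect agreement
    print(relative_entropy_similarity(reference, [3.0, 2.0, 1.0]))  # mismatch
    print(relative_entropy_similarity(reference, [1.0, 2.0, 0.0]))  # unexplained density
```

Returning negative infinity for unexplained map density is the mathematically exact choice; a practical implementation would need some regularization there, which is outside the scope of this sketch.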

2.3 The potential energy landscapes based on relative entropy are smooth

The proposed relative entropy density-to-density similarity measure has favorable properties in one-dimensional model refinement of one and two particles to a reference density (Fig 2). Both newly derived potentials have the same analytical form for our one-dimensional model case.

Fig 2. Similarity score determines ruggedness of the effective refinement potential energy landscape, also when balancing it with structural bias.

From top to bottom: a One-dimensional refinement of a single particle (black circle) towards a Gaussian-shaped density (gray) with inner-product (purple), cross-correlation (ochre), relative-entropy swapped (dark blue) and relative-entropy (green) as similarity scores. b Expanded model with two particles (black circles, x1 smaller and x2 larger) with two amplitude peaks in a one-dimensional density and target distribution (gray), and the resulting two-dimensional effective potential energy landscapes for inner-product, cross-correlation, swapped relative-entropy and relative-entropy similarity measures. c Combination of the similarity measure and force field contribution to the potential energy landscape, exemplified by a harmonic bond that keeps particles at half the distance between the Gaussian centers. For all relative weights of the contributions of the refinement potential and bond potential energy landscape (ratio 1:2 upper panel, 2:1 middle panel, as illustrated by the scale on the left), the relative entropy similarity score produces smooth landscapes with minima at the positions that are expected from the model input.

In contrast to cross-correlation and inner product similarity measures that have a steep and sudden onset for refinement forces in one dimension, the relative-entropy similarity score has a harmonic shape with long-ranged interactions that allow for efficient minimization. Using relative-entropy, the particle to be refined is attracted by a harmonic spring-like potential to the best-fitting position; far away from the minimum forces are large, but their magnitude decreases monotonically as the minimum is approached. Inner-product and cross-correlation based fitting potentials, however, exert almost no force on the particle outside the Gaussian spread width, exert a suddenly increasing force when moving closer to the Gaussian center, and become insignificant again only very close to the minimum.
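The harmonic shape can be reproduced in a few lines for the simplest case. The sketch below assumes a single particle whose model density is a Gaussian of the same width as a Gaussian target centered at zero, both normalized on a one-dimensional grid; grid extent, spacing, and width are arbitrary illustration values. Under these assumptions the negative relative entropy equals −x²/(2σ²), so its slope grows linearly with distance, whereas the inner product is almost flat a few widths away from the target.

```python
import math

GRID = [i * 0.1 for i in range(-200, 201)]  # 1D voxel centers (toy values)
SIGMA = 1.0                                  # common Gaussian width (assumption)


def gaussian_density(center):
    """Normalized Gaussian density on the grid."""
    values = [math.exp(-0.5 * ((g - center) / SIGMA) ** 2) for g in GRID]
    total = sum(values)
    return [v / total for v in values]


TARGET = gaussian_density(0.0)


def inner_product(x):
    """Inner-product similarity between target and a particle at x."""
    return sum(t * m for t, m in zip(TARGET, gaussian_density(x)))


def relative_entropy(x):
    """Negative relative entropy between target and a particle at x."""
    return sum(t * math.log(m / t) for t, m in zip(TARGET, gaussian_density(x)))


def slope(score, x, h=1e-3):
    """Central finite-difference derivative of a similarity score."""
    return (score(x + h) - score(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for x in (0.5, 2.0, 6.0):
        print(x, slope(inner_product, x), slope(relative_entropy, x))
```

Since the fitting force is proportional to the slope of the similarity score, the last column shows a spring-like restoring force at every distance, while the inner-product slope is negligible at six widths from the target.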

For the refinement of two particles, this advantage is only maintained for one of the newly derived potentials, where the relative-entropy-based potential energy landscape is less rugged and has fewer pronounced features and minima than the corresponding landscapes for the inner-product and cross-correlation based potentials (Fig 2). Only a single diagonal barrier is found in the relative-entropy-based potential landscape, corresponding to a “swapping” of particle positions, which alleviates the search for a global minimum. The inner-product-based free energy landscape has its minimum at a configuration where both particles are at the same position at the highest density. This issue can be alleviated in practical applications through a force-field prior that would enforce a minimum distance between the atoms (e.g. through van der Waals interactions). The swapped relative entropy potential on the other hand exhibits behavior similar to the inner-product, with similar minima but an overall smoother energy landscape.

To model the influence of a force field, the two particles were connected with a harmonic bond with increasing influence. The balance between density-based forces and bond strongly determines the shape of the resulting energy landscape, but here too relative entropy provides a smoother landscape less sensitive to the specific relative weight of refinement and bond potentials.

2.4 Adaptive force scaling reduces work exerted during refinement and allows for comparison of density-based potentials

To enable convergence to high similarity between structure and densities without distorting secondary structure, we employ an adaptive force scaling heuristic. Established protocols where the force constant has to be determined manually require an iterative trial-and-error approach. We address this by introducing an adaptive force-scaling as depicted in Fig 3a to automatically balance force-field and density-based forces during the refinement.

Fig 3. Adaptive scaling of contributions from force-field and cryo-EM density data overcomes potential energy barriers without excessive work input.

a Adaptive force scaling heuristically balances force-field and density influence during refinement simulations. b Particle in energy landscape where density similarity increases from left to right along the black curve. For the upper leg alternative, the similarity decreases despite biasing forces (burgundy arrow), which causes the bias strength to be increased. Conversely, in a scenario where the similarity remains high (lower leg), the biasing force will gradually be reduced to allow the system to better sample the local landscape. c Brownian diffusion in a potential with fixed (grey) and adaptive (burgundy) biasing forces, respectively. The constant biasing force is scaled such that both force-adding schemes yield the same average mean first passage time moving from left to right. The adaptive approach leads to significantly lower work exerted on the system (compare the areas under the grey and burgundy curves), which reduces perturbation of the dynamics of the system.

Cryo-EM refinement simulations are non-equilibrium simulations with the aim to drive a system from an initial model state to a final state that is as similar to the cryo-EM density as possible while avoiding structural distortions that result e.g. from unphysical paths. To avoid or at least reduce the latter during refinement, a heuristic protocol has been devised that aims to minimize work exerted on the system while still requiring as little time as possible for the refinement.

With similarity S as reaction coordinate when driving the system from an initial fit Sstart to Send, the exerted work from the density-guided simulation potential is only determined by the variation of the force-constant as the system progresses:

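$$W_{\mathrm{fit}} = \int_{S_{\mathrm{start}}}^{S_{\mathrm{end}}} -\frac{\partial}{\partial S}\left(-kS\right)\mathrm{d}S = \int_{S_{\mathrm{start}}}^{S_{\mathrm{end}}} k\,\mathrm{d}S \approx \sum_{\mathrm{frame}} k_{\mathrm{frame}}\,\Delta S_{\mathrm{frame}}. \tag{10}$$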

Eq (10) shows that any protocol that decreases k if ΔS > 0 and conversely increases k if ΔS < 0 is guaranteed to exert less work to reach the same similarity score than keeping the force constant fixed at the final value of the adaptive scaling protocol, given Sstart < Send. Fig 3b shows how adaptive force scaling implements this behaviour, starting from a low force constant k.
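The discrete sum in Eq (10) and the direction of the force-constant update can be sketched as follows. The multiplicative update with a fixed `factor` is a hypothetical stand-in chosen for brevity; the actual scaling protocol is the one described in the methods section and may differ. All function names are illustrative.

```python
def exerted_work(force_constants, similarities):
    """Discrete form of Eq (10): sum over frames of k_frame * (S_next - S_current)."""
    work = 0.0
    for k, s_old, s_new in zip(force_constants, similarities, similarities[1:]):
        work += k * (s_new - s_old)
    return work


def adapt_force_constant(k, delta_s, factor=1.05):
    """Hypothetical rule: lower k if similarity improved, raise it otherwise."""
    return k / factor if delta_s > 0.0 else k * factor


def force_constant_schedule(similarities, k_start, factor=1.05):
    """Force constant applied in each frame of a given similarity trajectory."""
    schedule = []
    k = k_start
    for s_old, s_new in zip(similarities, similarities[1:]):
        schedule.append(k)  # k in effect while moving from s_old to s_new
        k = adapt_force_constant(k, s_new - s_old, factor)
    return schedule


if __name__ == "__main__":
    trajectory = [0.0, 1.0, 0.5, 2.0]  # toy similarity values per frame
    ks = force_constant_schedule(trajectory, k_start=1.0, factor=2.0)
    print(ks, exerted_work(ks, trajectory))
```

In a real simulation the similarity trajectory itself depends on the force constant, so this sketch only illustrates the bookkeeping, not the coupled dynamics.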

A one-dimensional Brownian diffusion model system (Fig 3c) is used to test the performance of the concrete scaling protocol as described in the methods section of this paper. In this model, the similarity score simply increases with increasing particle coordinate value. Biasing the system towards increasing coordinate values with adaptive force scaling in contrast to a constant force allows for the particle to reach a given coordinate value in the same average first-passage time at a much lower average work input. Without any coupling of the free energy landscape to the adaptive force scaling protocol other than through the particle trajectory, the adaptive force scaling increases the force just sufficiently to allow overcoming energy barriers but then reduces it again.

Instead of setting a limit to the adaptive force scaling heuristic, we chose to continue the force-scaling until the limits of the molecular dynamics integration algorithm are reached, with the benefit that trajectories are created with structures that represent a wide range of balances between stereo-chemistry and data.

Adaptive force scaling further enables the comparison of relative entropy to other established density-based potentials in simulations with cryo-EM data because it disentangles the effect of the force constant choice from the choice of refinement potential. To carry out this comparison on cryo-EM data with our newly derived similarity score within our framework, a model density generation protocol is required which is shown below.

2.5 Deriving an optimal model density generation for cryo-EM data refinement

To evaluate similarities between structural models and cryo-EM densities, a model electron scattering probability density is generated from atom positions. Two dominant effects are convoluted when modeling electron scattering probabilities: The scattering cross-section of each atom and their thermal motion. Both are approximated with Gaussian functions of amplitude A and width σ. The scattering cross sections determine A (S1 Table). For convenience, we approximate scattering amplitudes by unity for all heavy atoms and zero for hydrogens. The magnitude of thermal fluctuation of atoms at cryogenic temperatures determines the spread width σ.
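A minimal sketch of this spreading step is shown below. It places an isotropic Gaussian of unit amplitude on every heavy atom and amplitude zero on hydrogens, as described above. The data layout (a list of element and position pairs, voxel centers at integer multiples of the spacing, a dictionary as grid) is an assumption made to keep the example short and is not how a production code would store a density.

```python
import math


def model_density(atoms, shape, spacing, sigma):
    """Spread atoms onto a 3D grid as isotropic Gaussians.

    atoms: list of (element, (x, y, z)); hydrogens get amplitude 0, others 1.
    Returns a dict mapping voxel index (i, j, k) to the density value.
    """
    density = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                center = (i * spacing, j * spacing, k * spacing)
                value = 0.0
                for element, position in atoms:
                    amplitude = 0.0 if element == "H" else 1.0
                    dist2 = sum((c - p) ** 2 for c, p in zip(center, position))
                    value += amplitude * math.exp(-0.5 * dist2 / sigma ** 2)
                density[(i, j, k)] = value
    return density


if __name__ == "__main__":
    atoms = [("C", (1.0, 1.0, 1.0)), ("H", (2.0, 2.0, 2.0))]
    grid = model_density(atoms, shape=(4, 4, 4), spacing=1.0, sigma=0.5)
    print(max(grid, key=grid.get), grid[(1, 1, 1)], grid[(2, 2, 2)])
```

The loop over all voxels and all atoms is quadratic in cost and only meant to show the arithmetic.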

In practice, these limitations to the model resolution are superseded by the finite performance of the measurement instrument and the reconstruction process where structural heterogeneity, detector pixel size, microscope lenses, and particle alignment limit the resolution. We do not account for structural heterogeneity, because it is an ensemble effect. A connection between the approach presented here and an ensemble model may be made though by employing a probability distribution p(x) instead of x in Eq (2) and leveraging ensemble simulations [23]. Other resolution-limiting effects are approximated by additional convolution of the generated maps with a Gaussian kernel. Rather than aiming to reproduce the same blur as in the experimental map, we strive to preserve as much information as possible from the physical model.

A balance between information loss due to under-sampling on the grid on the one hand and information loss due to coarse blurring on the other is found where the Gaussian width at half maximum height equals the resolution. The maximum representable resolution on a grid corresponds to twice the Nyquist frequency δ (corresponding to the pixel and voxel size) so that the Gaussian width σ is approximated in refinement simulations from the highest local resolution or, where that data is not applicable, from the voxel size,

$$\sigma = \frac{\mathrm{res}_{\max}}{2\sqrt{\log 4}} \geq \frac{2\delta}{2\sqrt{\log 4}}. \tag{11}$$

For computational efficiency Gaussian spreading is truncated at 4σ for all simulations in this publication, accounting for more than 99.8% of the density (Fig A in S1 Appendix). The small limitation on the maximal distance between the initial model structure and the cryo-EM density through this cutoff has proven to be irrelevant for all practical purposes, as density-based forces will “pull” structures into densities as soon as there is minimal overlap between model density and cryo-EM density, which can easily be achieved with an initial alignment. Interestingly, this approach results in a smaller Gaussian spreading width than previously applied ones that aim to reproduce a density map with the same overall resolution as the experimental cryo-EM density. As a result, it maintains as much structural information as possible in the model density while still reducing the computational costs.
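The width rule and the effect of the cutoff can be checked numerically. The sketch below assumes that the full width at half maximum of the Gaussian is set equal to the resolution, which gives σ = resolution / (2·√(log 4)), and that truncation means discarding everything outside a sphere of radius 4σ around each atom; the spherical shape is an assumption of this sketch, and the paper's own analysis is in Fig A in S1 Appendix. Function names are illustrative.

```python
import math


def sigma_from_resolution(resolution):
    """Gaussian width whose full width at half maximum equals the resolution."""
    return resolution / (2.0 * math.sqrt(math.log(4.0)))


def sigma_from_voxel_size(voxel_size):
    """Fallback: use the finest representable resolution, twice the voxel size."""
    return sigma_from_resolution(2.0 * voxel_size)


def mass_within_radius(n_sigma):
    """Fraction of an isotropic 3D Gaussian inside a sphere of radius n_sigma * sigma."""
    return (math.erf(n_sigma / math.sqrt(2.0))
            - math.sqrt(2.0 / math.pi) * n_sigma * math.exp(-0.5 * n_sigma ** 2))


if __name__ == "__main__":
    print(sigma_from_resolution(3.0))   # width for a 3 Angstrom map
    print(sigma_from_voxel_size(1.0))   # width for 1 Angstrom voxels
    print(mass_within_radius(4.0))      # density retained by a 4-sigma cutoff
```

Under the spherical-cutoff assumption the retained fraction is consistent with the statement that more than 99.8% of the density is accounted for.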

2.6 Refinement against noise-free data

To separate additional noise effects in experimental data and possible limitations in the above model, we first assess refinement with ideal data where a small straight helix model system [14] has been refined against a synthetically generated target density of the same helix in a kinked configuration. As illustrated in Fig 4a, adaptive force scaling and relative-entropy as similarity score efficiently fit the helix into the synthetic cryo-EM density [24].

Fig 4. Refinement into noise-free data with adaptive force scaling.

a Aligned (left) and unaligned (right) starting conformations (black sticks) of a helix subject to refinement simulation into a synthetically generated cryo-EM density (gray mesh). b RMSD per residue of the final refined models starting from the aligned conformation compared to the ground truth model underlying the synthetic density map. Each replicate (n=7) is colored by the similarity measure used, inner-product (purple), cross-correlation (ochre), relative-entropy swapped (dark blue), and relative-entropy (green). c RMSD per residue of the final refined models starting from the unaligned conformation compared to the ground truth model underlying the synthetic density map. Each replicate (n=7) is colored by the similarity measure used, inner-product (purple), cross-correlation (ochre), relative-entropy swapped (dark blue), and relative-entropy (green).

The combination of adaptive force scaling and different similarity scores achieved a consistent global fit when the helix was aligned to the density, with some fluctuations of the results (Fig 3c) due to the stochastic nature of molecular dynamics simulations. All four similarity measures lead to a qualitatively similar evolution of the adaptive force constant (S1 Fig; the absolute value has no meaning since it is merely a relative factor describing the balance between force field and density fitting). Simulations starting from both the aligned and unaligned starting positions occasionally get stuck in local minima, which can result in bad fits. The average total RMSD of all replicates was lowest for relative-entropy starting from the aligned position and further improved when the helix was initially unaligned (S2 and S3 Tables). The relative-entropy based potential shows markedly better results for the unaligned refinement and achieved a fit with less than 1 Å RMSD in 6 out of 7 replicates, while inner-product, cross-correlation, and relative-entropy swapped based potentials in some instances completely fail to align the helix (S2 and S3 Figs). For a single helix this is a slightly artificial case, but in a large structure undergoing significant transitions, it will be common for some secondary structure elements to not overlap with the target density in an initial phase of the refinement.

2.7 Refinement against experimental cryo-EM data

Experimental cryo-EM densities of aldolase and a GroEL subunit were used to test the performance of adaptive force scaling in combination with different refinement potentials on experimental data. These maps contain increasing amounts of density that is not accounted for in our model description, as well as noise that cannot be fully accounted for by the model assumptions.

By using adaptive force scaling to refine a previously published X-ray structure of rabbit muscle aldolase [25] against a recently published, independently determined cryo-EM structure [26], we consistently achieve accurate refinement across all potentials with good stereochemistry (S4 and S5 Tables). Fig 5a shows the final refined models, with a global deviation of less than 1 Å heavy-atom root mean square deviation (RMSD) from the deposited model using inner-product and cross-correlation measures and just above 1 Å for the relative-entropy based density potential.

Fig 5. Refinement of an all-atom X-ray aldolase structure (PDB id 6ALD) into experimental cryo-EM density (EMD-21023).

a Final structure models from density-guided simulations using different similarity scores colored by unaligned root mean square coordinate deviation (RMSD) per residue from the manually built model (PDB id 6V20). b Fourier shell correlation of starting structure (gray line), rigid-body fit of the starting model to the target density (blue) as well as refinement results in the last simulation frame (solid lines). The reported cryo-EM map resolution and 0.143 FSC are indicated with grey lines. c Unweighted FSC average over the course of refinement simulation.

The close agreement with the cryo-EM data is reflected in the FSC of the models refined against the density (Fig 5b) being nearly indistinguishable from the deposited model. The relative-entropy-based potential emphasizes agreement with global features at the cost of local resolution (S4 Fig), while still providing good agreement to the cryo-EM density.

The unweighted FSC average [27] serves as an established similarity score that is not related to the biasing potentials which were used to refine the system (Fig 5b). All underlying potentials appear to lead to refinement simulations that converge in less than 2 ns, as shown in Fig 5c. The less rugged and long-range potential properties of relative-entropy based density forces are reflected in a rapid rigid-body like initial fit to global structural features, while the other potentials show gradual improvements in fit.

The refinement of a GroEL subunit in two different conformations as determined by cryo-EM [28] stretches the limits of the model assumptions of our refinement potential by refining it against a noisier density with imperfect map-to-model correspondence. Similar to aldolase refinement, adaptive force scaling allows for rapid and reliable refinement into the model density, as shown in Fig 6 (S6 and S7 Tables). However, the relative-entropy based potential’s propensity to take all density into account leads to deviations from the published model in regions with density that has no correspondence in the all-atom model (S5 and S6 Figs).

Fig 6. Refinement of an all-atom GroEL cryo-EM structure (PDB id 5W0S state 1) into experimental cryo-EM density (EMD-8750, additional map 3).

a Final structure models from density-guided simulations using different similarity scores colored by unaligned root mean square coordinate deviation (RMSD) per residue from the deposited model (PDB id 5W0S state 3). b Fourier shell correlation of starting structure (gray line), rigid-body fit of the starting model to the target density (blue) as well as refinement results in the last simulation frame (solid lines) and deviations of an equilibrium simulation (dotted lines). c Unweighted FSC average over the course of refinement simulation.

2.8 Model rectification by combining force-field and cryo-EM data

To assess performance in larger structural transitions, we repeated the aldolase refinement starting from initial model structures that had been distorted by heating, with partially unfolded secondary structure elements (Fig 7, as described in Methods). Fig 7b shows the final relative-entropy based model of the refinement procedure, which achieved 1.13 Å heavy-atom RMSD from the manually built model. Structural details at map resolution match in secondary structure elements. In contrast to refinement of the undistorted X-ray structure, the relative-entropy based potential gains less from the long-rangedness of the potential and the rapid alignment of large-scale features, because structural rearrangements were needed on all length scales. The adaptive force scaling protocol alleviates differences between density-based potentials in refinement speed and allows for refinement with good structural agreement in less than 3 ns (S7 Fig). The adaptive force scaling protocol allows the modeling to be steered more strongly by cryo-EM data and to reach model structures that would not have been accessible by modeling using stereochemical information from the force-field alone.

Fig 7. Cryo-EM data rectifies model distortions with density-guided simulations.

a Distorted starting model RMSD with respect to manually built model (PDB id 6V20). b Final model structure after refinement into a cryo-EM density (EMD-21023) using adaptive force scaling and relative-entropy similarity score. c Close-up of structural features of the final simulation model (green lines) and cryo-EM density (gray mesh). d Fourier shell correlation of starting structure (gray line) as well as refinement results in the last simulation frame (solid lines). The reported cryo-EM map resolution and 0.143 FSC value are indicated with grey lines. e Un-weighted FSC average over the course of a refinement simulation.

3 Discussion

While defining a purely empirical similarity measure can sometimes suffice to fit structures to cryo-EM densities, connecting the similarity measure to the underlying measurement process of the target density enables derivation of natural similarity measures. From very few assumptions, this results in density-based potentials derived from the maximum likelihood that coincides with the relative-entropy between a model density generated from model/simulation atom coordinates and the target cryo-EM density.

The newly defined potentials have favorable features, with the Poisson statistics-based relative entropy potential most prominently exhibiting long range and low ruggedness. It avoids local minima that do not correspond to desired configurations during refinement, allows rapid alignment of large-scale features, and outperforms established refinement potentials in the zero-noise setting with synthetic density maps. The noise content in current cryo-EM densities is likely still too high to be handled with the current minimalistic model assumptions, but as the quality of cryo-EM and other low-to-medium resolution techniques continues to rapidly improve, we believe there will be even more advantages to models that do not depend on smoothing. In addition, the adaptive force scaling provides a surprisingly simple way to tackle the inaccessibility of the balance between force-field and density-based forces within our model assumptions. It allows for parameter-free refinement that is one to two orders of magnitude faster than currently established protocols. Conceptually it is orthogonal to, and easily combined with, multi-resolution protocols [15].

Another illustration of the usefulness of our framework for handling force-field vs. fitting forces is how it enables us to deduce a close-to-optimal model density spread for refinement, and even more so that this value is not identical to the common practice of setting it equal to the experimental resolution. While many of these factors could still be tuned manually, removing them as free parameters means fewer arbitrary settings need to be tuned to avoid over- or underfitting, which will be even more important when trying to combine e.g. multiple sources of experimental data. For trial structure refinement against recent cryo-EM data, we show that we achieve excellent fits independent of initial model quality.

One limitation of the current formulation is that it does not explicitly take more information from the cryo-EM density reconstruction process into account. A first step to broaden the approach presented here is to account for the local resolution information from the cryo-EM map, which may be seen as a measure of the noisiness of the density data. In practice, the local resolution will still influence the fit since low local resolution will correspond to smoother regions of the map, and lower-magnitude gradients will lead to lower-magnitude fitting forces in those regions. However, a formally more correct way to address the problem is likely to treat the target cryo-EM density as a statistical distribution with a variance that is spatially resolved—this is something we intend to pursue in the future to see whether it can further improve the issue of the relative-entropy potential with more noisy data.

Our current algorithm opens the path to a fully Bayesian approach by generating structures from which the correct probability distribution and the scale parameters s and r for the Dirichlet and Poisson model, respectively, may be estimated, and thus the correct balance between force-field and cryo-EM data. This will also alleviate the shortcoming that the adaptive force scaling is an inherently non-equilibrium approach, which might be thought of as an “initial structure generator” for further equilibrium sampling.

The algorithms proposed in this work are freely available, integrated, and maintained as part of GROMACS [29]. Overall, three independent building blocks are provided to aid the modeling of cryo-EM data, each of which may be individually implemented in current modeling tools: a new refinement potential and a new criterion for how to calculate the model density, both based on reasoning from a maximum likelihood approach, and adaptive force scaling to gently and automatically balance the influence of stereochemistry and cryo-EM data. The implementation also provides tools to monitor the refinement process. Although it can still be difficult for any automated method to compete with manual model building by an experienced structural biologist, we believe these methods provide new ways to extract as much structural information as possible from cryo-EM densities at minimal human and computational cost, which is particularly attractive e.g. for fully automated model building.

4 Methods

4.1 Calculating density-based forces

For ease of implementation and computational efficiency the derivative of Eq (4) is decomposed into a similarity measure derivative and a simulated density model derivative, summed over all density voxels v

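$$\mathbf{F}_{\mathrm{density}} = k \sum_{v} \partial_{\rho^{s}_{v}} S(\rho, \rho^{s}) \cdot \nabla_{\mathbf{r}}\, \rho^{s}_{v}(\mathbf{r}). \tag{12}$$

Though the convolution in Eq (12) might be evaluated with possible performance benefits in Fourier space, we have implemented the more straightforward real-space approach.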

The forward model $\rho^{s}$ is calculated using fast Gaussian spreading as used in [30]; the integral over the three-dimensional Gaussian function over a voxel is approximated by its function value at the voxel center v at little information loss (Fig B in S1 Appendix). Amplitudes of the Gaussian functions [31] have been approximated with unity for all atoms except hydrogen. The explicit terms that follow for $S(\rho, \rho^{s})$ and $\nabla_{\mathbf{r}}\, \rho^{s}_{v}(\mathbf{r})$ are stated in the S1 Appendix.
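The decomposition into a similarity derivative and a model-density derivative can be sketched for a single atom as follows. This is a minimal illustration under stated assumptions, not the GROMACS implementation: the similarity score is taken to be a plain, unnormalised inner product (the explicit expressions used in this work are in the S1 Appendix), the Gaussian has unit amplitude and is sampled at voxel centres, no 4σ truncation is applied, and all function names are hypothetical.

```python
import numpy as np


def voxel_centres(shape, spacing):
    """Return an (n_voxels, 3) array of voxel-centre coordinates."""
    axes = [np.arange(n) * spacing for n in shape]
    return np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)


def spread_atom(r, centres, sigma):
    """Unit-amplitude Gaussian of one atom at r, sampled at the voxel centres."""
    d2 = np.sum((centres - r) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma**2))


def inner_product(target, rho_s):
    """Assumed similarity score: unnormalised inner product of the two densities."""
    return float(np.dot(target, rho_s))


def density_force(r, centres, target, sigma, k):
    """Force on one atom: k * sum_v dS/drho_s_v * grad_r rho_s_v(r)."""
    rho_s = spread_atom(r, centres, sigma)
    ds_drho = target  # derivative of the inner product w.r.t. each rho_s_v
    drho_dr = rho_s[:, None] * (centres - r) / sigma**2  # gradient of each voxel value w.r.t. r
    return k * np.sum(ds_drho[:, None] * drho_dr, axis=0)
```

With a target density generated from a single Gaussian, the resulting force points from the atom toward the centre of that Gaussian, and it agrees with a finite-difference derivative of k times the similarity score.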

4.2 Multiple time-stepping for density-based forces

For computational efficiency, density-based forces are applied only every Nfit steps. The applied force is scaled by Nfit to approximate the same effective Hamiltonian as when applying the forces every step while maintaining time-reversibility and energy conservation [32, 33]. The maximal time step should not exceed the fastest oscillation period of any atom within the map potential divided by π. This oscillation period depends on the choice of reference density, the similarity measure, and the force constant and has thus been estimated heuristically.
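The force schedule described above can be sketched in a few lines. The helper names are hypothetical, the force is reduced to a scalar for brevity, and interpreting the time-step bound as applying to the interval between density-force evaluations is an assumption.

```python
import math


def density_force_this_step(step: int, n_fit: int, force: float) -> float:
    """Density force applied at this step: scaled by n_fit when applied, zero otherwise."""
    return n_fit * force if step % n_fit == 0 else 0.0


def max_force_interval(fastest_period: float) -> float:
    """Upper bound on the interval between force evaluations: oscillation period / pi."""
    return fastest_period / math.pi
```

Summed over a multiple of n_fit steps, the applied impulse equals that of applying the unscaled force at every step, which is the sense in which the effective Hamiltonian is approximately preserved.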

4.3 Adaptive force scaling

Adaptive force constant scaling decreases the force constant by a factor 1 + α, with α > 0, when similarity increases, and conversely increases it by a factor 1 + 2α when similarity decreases. Because the increase factor is larger than the decrease factor, the protocol enforces an increase in similarity over time.

To avoid spurious fast changes in force-constant, similarity decrease and increase are determined by comparing similarity scores of an exponential moving average. The simulation time scale is coupled to the adaptive force scaling protocol by setting $\alpha = N_{\mathrm{fit}}\,\Delta t / \tau$, where Δt is the smallest time increment step of the simulation and τ determines the time-scale of the coupling.

This adaptive force scaling protocol ensures a growing influence of the density data in the course of the simulation, eventually dominating the force-field. Simulations with adaptive force scaling are terminated when overall forces on the system are too large to be compatible with the integration time step.
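A minimal sketch of this update rule is given below. Several details are assumptions made for illustration and are not taken from the text: a decrease "by a factor 1 + α" is read as division by 1 + α, the exponential moving average uses a hypothetical smoothing weight, and an increase or decrease is detected by comparing the new average with the previous one.

```python
from dataclasses import dataclass
from typing import Optional


def coupling_alpha(n_fit: int, dt: float, tau: float) -> float:
    """alpha = N_fit * dt / tau, with dt and tau in the same time unit."""
    return n_fit * dt / tau


@dataclass
class AdaptiveForceScaling:
    k: float                     # current force constant (arbitrary units)
    alpha: float                 # coupling strength from coupling_alpha()
    smoothing: float = 0.1       # hypothetical weight of the moving average
    ema: Optional[float] = None  # exponential moving average of the similarity

    def update(self, similarity: float) -> float:
        """Update the moving average and rescale the force constant accordingly."""
        if self.ema is None:
            self.ema = similarity  # first sample only initialises the average
            return self.k
        previous = self.ema
        self.ema = (1.0 - self.smoothing) * previous + self.smoothing * similarity
        if self.ema > previous:
            self.k /= 1.0 + self.alpha        # similarity improved: relax the bias
        elif self.ema < previous:
            self.k *= 1.0 + 2.0 * self.alpha  # similarity dropped: push harder
        return self.k
```

One improvement followed by one deterioration leaves the force constant larger by a factor (1 + 2α)/(1 + α) > 1, which illustrates why the influence of the density grows over the course of a simulation.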

4.4 Comparing refined structures to manually built models

Root mean square deviations (RMSD) of all heavy atom coordinates (excluding hydrogen atoms) were calculated from absolute positions without superposition of structures, because the cryo-EM density provides the absolute frame of reference. This is an upper bound to RMSD values between refined and manually built models calculated with rotational and translational alignment.
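The upper-bound relation can be illustrated with a short comparison of the two RMSD variants. This is a sketch with hypothetical function names; the superposed variant uses the standard Kabsch algorithm, which is a choice made here for illustration rather than a method named in the text.

```python
import numpy as np


def rmsd_absolute(a, b):
    """RMSD between two (n_atoms, 3) coordinate sets without any superposition."""
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))


def rmsd_superposed(a, b):
    """RMSD after optimal translational and rotational alignment (Kabsch)."""
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    u, _, vt = np.linalg.svd(a0.T @ b0)
    # Correct for a possible reflection so the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return rmsd_absolute(a0 @ rot.T, b0)
```

Because the identity transformation is one of the candidates considered by the alignment, the superposed RMSD can never exceed the absolute one.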

4.5 Comparing refined structures to cryo-EM densities

Fourier shell correlation curves and unweighted Fourier shell correlation averages [27] were calculated at 4 ps intervals from structures during the trajectories by generating densities from the model structures using a Gaussian σ of 0.45 Å, corresponding to a resolution of 2 Å as defined in EMAN2 [24].
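The analysis can be sketched as follows. This is not the GROMACS implementation: the shell binning and the plain mean over shells are simplifying assumptions, and the relation σ = resolution / (π√2) is assumed here because it reproduces the quoted pair of 0.45 Å and 2 Å. Frequencies are in cycles per voxel; dividing by the voxel size converts them to inverse length.

```python
import numpy as np


def sigma_from_resolution(resolution):
    """Gaussian width for a target resolution, assuming sigma = resolution / (pi * sqrt(2))."""
    return resolution / (np.pi * np.sqrt(2.0))


def fsc_curve(map_a, map_b, n_shells):
    """Fourier shell correlation of two equally shaped 3D maps up to Nyquist."""
    fa = np.fft.fftn(map_a)
    fb = np.fft.fftn(map_b)
    freqs = [np.fft.fftfreq(n) for n in map_a.shape]
    kx, ky, kz = np.meshgrid(*freqs, indexing="ij")
    radius = np.sqrt(kx**2 + ky**2 + kz**2).ravel()
    edges = np.linspace(0.0, 0.5, n_shells + 1)
    shell = np.digitize(radius, edges) - 1  # shell index per Fourier coefficient
    cross = (fa * np.conj(fb)).real.ravel()
    power_a = (np.abs(fa) ** 2).ravel()
    power_b = (np.abs(fb) ** 2).ravel()
    fsc = np.full(n_shells, np.nan)
    for s in range(n_shells):
        sel = shell == s
        denom = np.sqrt(power_a[sel].sum() * power_b[sel].sum())
        if denom > 0:
            fsc[s] = cross[sel].sum() / denom
    return edges[1:], fsc


def unweighted_fsc_average(fsc):
    """Plain mean over all non-empty shells."""
    return float(np.nanmean(fsc))
```

Comparing a map with itself gives a correlation of one in every shell, while two independent noise maps give an average close to zero.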

4.6 Map and model preparation before refinement

4.6.1 Helix

Noise-free helix density maps at 2 Å simulated resolution on a 1 Å voxel grid were generated from an atomic model using “molmap” as provided by UCSF Chimera [34]. Two frames taken from an equilibration simulation of the helix model were used as starting models for subsequent density-guided refinement.

4.6.2 Aldolase

A simulation box with exactly the same dimensions as the aldolase density map EMD-21023 was used. The corresponding aldolase model (PDB id 6V20) was treated as a ground truth for RMSD calculations. A previously determined X-ray model (PDB id 6ALD) was used as starting structure. The system was subjected to energy minimization before fitting. No symmetry constraints were used in simulations.

4.6.3 GroEL

The density-guided simulations of GroEL were performed between conformational states within an individual oligomer [28]. State 1 (PDB id 5W0S(1)) was used as starting model and fit toward conformational state 3 (EMD-8750, additional map 3). A section of the density map corresponding to a single oligomer was used as the target. The target map was created using ChimeraX [35] by including any volume data within a 5 Å range of the previously determined model corresponding to state 3 (PDB id 5W0S(3)). The target density was centered in a map and the map size was set to 100³ voxels using the RELION image handler [36].
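The distance-based selection of target density can be illustrated with a small sketch. This is not the ChimeraX procedure itself; the function name and arguments are hypothetical, voxel centres are assumed to lie at origin + index × spacing, and the brute-force loop over atoms is only suitable for small examples.

```python
import numpy as np


def mask_density_near_model(density, origin, spacing, coords, cutoff):
    """Zero all voxels whose centre is farther than `cutoff` from every atom."""
    idx = np.indices(density.shape).reshape(3, -1).T
    centres = np.asarray(origin, dtype=float) + idx * spacing
    keep = np.zeros(len(centres), dtype=bool)
    for atom in np.asarray(coords, dtype=float):
        # Mark voxels within the cutoff of this atom.
        keep |= np.sum((centres - atom) ** 2, axis=1) <= cutoff**2
    return np.where(keep.reshape(density.shape), density, 0.0)
```

Applied with a 5 Å cutoff, this retains the density around the model and discards the remainder of the map, such as density belonging to neighbouring subunits.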

4.7 Generation of a distorted model

To generate a distorted starting model, the aldolase X-ray structure was heated to 43313 K over a period of 5 ns. During heating, the pressure was controlled with the Berendsen barostat, favoring simulation stability over thermodynamic considerations. To disentangle effects from decreasing the temperature and fitting, the distorted structure was subjected to 5 ns of equilibration at 300 K before starting the density-guided simulations with the same protocol as described above.

4.8 Molecular dynamics simulation

All simulations were carried out with GROMACS version 2021.3 [29] and the CHARMM27 force-field [37, 38] in NPT and NVT ensembles with neutralized all-atom systems in 150 mM NaCl solution. The temperature was regulated with the velocity-rescaling thermostat with a coupling time constant of 0.2 ps to ensure rapid dissipation of excess energy from the density-based potential when structures are very dissimilar from the cryo-EM density, i.e., far from equilibrium. The pressure was controlled with the Parrinello-Rahman barostat for aldolase simulations; helix simulations were carried out at constant volume. Aldolase and GroEL were aligned roughly to the density by placing their center of geometry in the center of the cryo-EM density box. Forces from density-guided simulations were applied every Nfit = 10 steps according to the protocol described above [33]. All simulations to refine a structure against a density were carried out with adaptive force scaling. We used a coupling constant of τ = 4 ps, balancing time to result against the time needed for structures to relax. For aldolase simulations, the Gaussian spread width was determined by using a lower bound on the highest estimated local resolution of 1.83 Å. The spread width for GroEL simulations was set to 1.0455 Å, based on the 1.23 Å voxel size in the map used for refinement. Periodic boundary conditions are treated as described in the S1 Appendix. All simulation setup parameters and workflows have been made available.
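For orientation, the density-guided settings stated above might translate into a GROMACS parameter fragment along the following lines. This is a hedged sketch, not the authors' input files (those are in the Zenodo archive): the option names follow the density-guided simulation section of the GROMACS mdp documentation as best recalled and should be checked against the version in use, the group name and reference file name are hypothetical placeholders, the similarity measure shown is just one of the tested choices, and the spread width is converted from Å to nm on the assumption that the option expects nanometres.

```python
ANGSTROM_TO_NM = 0.1


def render_mdp(options):
    """Render a dict of options as aligned 'key = value' lines in mdp style."""
    width = max(len(key) for key in options)
    return "".join(f"{key:<{width}} = {value}\n" for key, value in options.items())


groel_density_options = {
    "density-guided-simulation-active": "yes",
    "density-guided-simulation-group": "Protein",  # hypothetical index group
    "density-guided-simulation-similarity-measure": "relative-entropy",
    "density-guided-simulation-reference-density-filename": "reference.mrc",  # placeholder
    "density-guided-simulation-nst": 10,  # N_fit from the text
    "density-guided-simulation-gaussian-transform-spreading-width":
        f"{1.0455 * ANGSTROM_TO_NM:.5f}",  # 1.0455 Angstrom expressed in nm
    "density-guided-simulation-gaussian-transform-spreading-range-in-multiples-of-width": 4,
    "density-guided-simulation-adaptive-force-scaling": "yes",
    "density-guided-simulation-adaptive-force-scaling-time-constant": 4,  # tau in ps
}

mdp_text = render_mdp(groel_density_options)
```

Writing `mdp_text` to a file and appending it to a standard molecular dynamics parameter file would be one way to use the fragment; thermostat, barostat, and force-field settings are deliberately left out here.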

Supporting information

S1 Appendix. Derivation of refinement potentials.

Derivation of relative entropy measures from Poisson noise assumption, grid scattering assumptions, model density gradients and similarity measure definitions.

(PDF)

S1 Table. Scattering cross sections.

Scattering amplitudes for common atoms in biological molecules at 150keV, derived from Appendix C in Ref. [31].

(PDF)

S2 Table. Aligned helix fitting.

Heavy-atom RMSD [Å] at the final frame compared to conformation from which density was generated.

(PDF)

S3 Table. Unaligned helix fitting.

Heavy-atom RMSD [Å] at the final frame compared to conformation from which density was generated.

(PDF)

S4 Table. Aldolase fitting.

Heavy-atom RMSD [Å] from final simulation frames, as compared to PDB id 6V20.

(PDF)

S5 Table. Aldolase model statistics.

Properties calculated with phenix-1.18.2–3874, using molprobity, CABLAM, and EMRinger methods. All data reflect the final frame without any further geometry optimization.

(PDF)

S6 Table. GroEL fitting.

Heavy-atom RMSD [Å] from final simulation frames, as compared to conformational state 3 of PDB id 5W0S.

(PDF)

S7 Table. GroEL model statistics.

Properties calculated with phenix-1.18.2–3874, using molprobity, CABLAM, and EMRinger methods. All data reflect the final frame without any further geometry optimization.

(PDF)

S1 Fig. Adaptive force constant change.

Evolution of force constant (arbitrary units) during aligned helix refinement for inner-product (purple), cross-correlation (ochre), relative-entropy (green) and relative-entropy swapped (blue).

(PDF)

S2 Fig. Final results of aligned helix fitting.

Simulations using adaptive force scaling from aligned starting conformation using inner-product, cross-correlation, relative-entropy swapped, and relative entropy (ordered top to bottom).

(PDF)

S3 Fig. Final results of unaligned helix fitting.

Simulations using adaptive force scaling from unaligned starting conformation using inner-product, cross-correlation, relative-entropy swapped, and relative entropy (ordered top to bottom).

(PDF)

S4 Fig. Aldolase refinement.

Difference in FSC to deposited Aldolase model for inner-product (purple), cross-correlation (ochre), swapped relative-entropy (blue) and relative-entropy (green) based on refinement final frames (solid).

(PDF)

S5 Fig. GroEL subunit refinement.

Difference in FSC to deposited GroEL model for inner-product (purple), cross-correlation (ochre), swapped relative-entropy (blue) and relative-entropy (green) based on refinement final frames (solid).

(PDF)

S6 Fig. Relative entropy can be sensitive to additional density.

Relative entropy-based refinement (blue-red, according to RMSD to published model) can struggle in regions with extra density present in the map, whereas the cross-correlation based potential (ochre) adheres to the local minimum defined by the density.

(TIF)

S7 Fig. Aldolase refinement from distorted structure.

Difference in FSC to manually built model for inner-product (purple), cross-correlation (ochre), relative-entropy swapped (dark blue) and relative-entropy (green) based refinements with best accepted FSC (dotted) and final FSC (solid).

(PDF)

Acknowledgments

We would like to thank Rebecca J. Howard and Marta Carroni for insightful discussions of the manuscript.

Data Availability

Simulation starting structures, generated densities, setup parameters, complete workflow setups via Makefiles and Python scripts to generate Fig 2 as well as data for Fig 3 and per-residue RMSD are available via Zenodo (https://doi.org/10.5281/zenodo.4556616). The code to perform density-guided molecular dynamics simulations is maintained within GROMACS and publicly available in release 2021 and later, as well as in the repository at https://gitlab.com/gromacs/gromacs. Fourier shell correlation analysis of trajectories has been implemented on top of the GROMACS codebase following conventions in EMAN2 [24] and is available at https://gitlab.com/gromacs/gromacs/-/commits/fscavg.

Funding Statement

Swedish Research Council (EL; 2017-04641, 2018-06479, 2019-02433), The BioExcel Center of Excellence (EL; EU 823830), Knut and Alice Wallenberg Foundation (EL; 1484505), Carl Trygger Foundation (EL; CTS-15:298) and the Swedish e-Science Research Centre. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Cheng Y. Single-particle cryo-EM—How did it get here and where will it go. Science. 2018;361(6405):876–880. doi: 10.1126/science.aat4346
  • 2. Yip KM, Fischer N, Paknia E, Chari A, Stark H. Atomic-resolution protein structure determination by cryo-EM. Nature. 2020;587(7832):157–161. doi: 10.1038/s41586-020-2833-4
  • 3. Nakane T, Kotecha A, Sente A, McMullan G, Masiulis S, Brown PM, et al. Single-particle cryo-EM at atomic resolution. Nature. 2020;587(7832):152–156. doi: 10.1038/s41586-020-2829-0
  • 4. Scheres SH. RELION: implementation of a Bayesian approach to cryo-EM structure determination. Journal of Structural Biology. 2012;180(3):519–530. doi: 10.1016/j.jsb.2012.09.006
  • 5. Alnabati E, Kihara D. Advances in structure modeling methods for cryo-electron microscopy maps. Molecules. 2020;25(1):82. doi: 10.3390/molecules25010082
  • 6. Bock LV, Blau C, Schröder GF, Davydov II, Fischer N, Stark H, et al. Energy barriers and driving forces in tRNA translocation through the ribosome. Nature Structural & Molecular Biology. 2013;20(12):1390–1396. doi: 10.1038/nsmb.2690
  • 7. Eswar N, Webb B, Marti-Renom MA, Madhusudhan M, Eramian D, Shen My, et al. Comparative protein structure modeling using Modeller. Current Protocols in Bioinformatics. 2006;15(1):5–6. doi: 10.1002/0471250953.bi0506s15
  • 8. Topf M, Lasker K, Webb B, Wolfson H, Chiu W, Sali A. Protein structure fitting and refinement guided by cryo-EM density. Structure. 2008;16(2):295–307. doi: 10.1016/j.str.2007.11.016
  • 9. Kim DN, Moriarty NW, Kirmizialtin S, Afonine PV, Poon B, Sobolev OV, et al. Cryo_fit: Democratization of flexible fitting for cryo-EM. Journal of Structural Biology. 2019;208(1):1–6. doi: 10.1016/j.jsb.2019.05.012
  • 10. Liebschner D, Afonine PV, Baker ML, Bunkóczi G, Chen VB, Croll TI, et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallographica Section D: Structural Biology. 2019;75(10):861–877. doi: 10.1107/S2059798319011471
  • 11. Igaev M, Kutzner C, Bock LV, Vaiana AC, Grubmüller H. Automated cryo-EM structure refinement using correlation-driven molecular dynamics. eLife. 2019;8:e43542. doi: 10.7554/eLife.43542
  • 12. Trabuco LG, Villa E, Mitra K, Frank J, Schulten K. Flexible fitting of atomic structures into electron microscopy maps using molecular dynamics. Structure. 2008;16(5):673–683. doi: 10.1016/j.str.2008.03.005
  • 13. Ahmed A, Whitford PC, Sanbonmatsu KY, Tama F. Consensus among flexible fitting approaches improves the interpretation of cryo-EM data. Journal of Structural Biology. 2012;177(2):561–570. doi: 10.1016/j.jsb.2011.10.002
  • 14. Wang Z, Schröder GF. Real-space refinement with DireX: From global fitting to side-chain improvements. Biopolymers. 2012;97(9):687–697. doi: 10.1002/bip.22046
  • 15. Singharoy A, Teo I, McGreevy R, Stone JE, Zhao J, Schulten K. Molecular dynamics-based refinement and validation for sub-5 Å cryo-electron microscopy maps. eLife. 2016;5:e16105. doi: 10.7554/eLife.16105
  • 16. Miyashita O, Kobayashi C, Mori T, Sugita Y, Tama F. Flexible fitting to cryo-EM density map using ensemble molecular dynamics simulations. Journal of Computational Chemistry. 2017;38(16):1447–1461. doi: 10.1002/jcc.24785
  • 17. Habeck M. Bayesian modeling of biomolecular assemblies with cryo-EM maps. Frontiers in Molecular Biosciences. 2017;4:15. doi: 10.3389/fmolb.2017.00015
  • 18. Bonomi M, Hanot S, Greenberg CH, Sali A, Nilges M, Vendruscolo M, et al. Bayesian weighing of electron cryo-microscopy data for integrative structural modeling. Structure. 2019;27(1):175–188. doi: 10.1016/j.str.2018.09.011
  • 19. Cossio P, Rohr D, Baruffa F, Rampp M, Lindenstruth V, Hummer G. BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images. Computer Physics Communications. 2017;210:163–171. doi: 10.1016/j.cpc.2016.09.014
  • 20. Páll S, Zhmurov A, Bauer P, Abraham M, Lundborg M, Gray A, et al. Heterogeneous parallelization and acceleration of molecular dynamics simulations in GROMACS. The Journal of Chemical Physics. 2020;153(13):134110. doi: 10.1063/5.0018516
  • 21. Jaynes ET. Information theory and statistical mechanics. Physical Review. 1957;106(4):620. doi: 10.1103/PhysRev.106.620
  • 22. Cheng Y, Grigorieff N, Penczek PA, Walz T. A primer to single-particle cryo-electron microscopy. Cell. 2015;161(3):438–449. doi: 10.1016/j.cell.2015.03.050
  • 23. Hummer G, Köfinger J. Bayesian ensemble refinement by replica simulations and reweighting. The Journal of Chemical Physics. 2015;143(24):12B634_1. doi: 10.1063/1.4937786
  • 24. Tang G, Peng L, Baldwin PR, Mann DS, Jiang W, Rees I, et al. EMAN2: an extensible image processing suite for electron microscopy. Journal of Structural Biology. 2007;157(1):38–46. doi: 10.1016/j.jsb.2006.05.009
  • 25. Choi KH, Mazurkie AS, Morris AJ, Utheza D, Tolan DR, Allen KN. Structure of a fructose-1, 6-bis (phosphate) aldolase liganded to its natural substrate in a cleavage-defective mutant at 2.3 Å. Biochemistry. 1999;38(39):12655–12664. doi: 10.1021/bi9828371
  • 26. Wu M, Lander GC, Herzik MA Jr. Sub-2 Angstrom resolution structure determination using single-particle cryo-EM at 200 keV. Journal of Structural Biology: X. 2020;4:100020. doi: 10.1016/j.yjsbx.2020.100020
  • 27. Brown A, Long F, Nicholls RA, Toots J, Emsley P, Murshudov G. Tools for macromolecular model building and refinement into electron cryo-microscopy reconstructions. Acta Crystallographica Section D: Biological Crystallography. 2015;71(1):136–153. doi: 10.1107/S1399004714021683
  • 28. Roh SH, Hryc CF, Jeong HH, Fei X, Jakana J, Lorimer GH, et al. Subunit conformational variation within individual GroEL oligomers resolved by Cryo-EM. Proceedings of the National Academy of Sciences. 2017;114(31):8259–8264. doi: 10.1073/pnas.1704725114
  • 29. Páll S, Zhmurov A, Bauer P, Abraham M, Lundborg M, Gray A, et al. Heterogeneous parallelization and acceleration of molecular dynamics simulations in GROMACS. The Journal of Chemical Physics. 2020;153(13):134110. doi: 10.1063/5.0018516
  • 30. Greengard L, Lee JY. Accelerating the nonuniform fast Fourier transform. SIAM Review. 2004;46(3):443–454. doi: 10.1137/S003614450343200X
  • 31. Kirkland EJ. Advanced computing in electron microscopy. Springer; 1998.
  • 32. Leimkuhler BJ, Reich S, Skeel RD. Integration methods for molecular dynamics. In: Mathematical Approaches to biomolecular structure and dynamics. Springer; 1996. p. 161–185.
  • 33. Ferrarotti MJ, Bottaro S, Pérez-Villa A, Bussi G. Accurate multiple time step in biased molecular simulations. Journal of Chemical Theory and Computation. 2015;11(1):139–146. doi: 10.1021/ct5007086
  • 34. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, et al. UCSF Chimera—a visualization system for exploratory research and analysis. Journal of Computational Chemistry. 2004;25(13):1605–1612. doi: 10.1002/jcc.20084
  • 35. Pettersen EF, Goddard TD, Huang CC, Meng EC, Couch GS, Croll TI, et al. UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Science. 2021;30(1):70–82. doi: 10.1002/pro.3943
  • 36. Zivanov J, Nakane T, Forsberg BO, Kimanius D, Hagen WJ, Lindahl E, et al. New tools for automated high-resolution cryo-EM structure determination in RELION-3. eLife. 2018;7:e42166. doi: 10.7554/eLife.42166
  • 37. Mackerell AD Jr, Feig M, Brooks CL III. Extending the treatment of backbone energetics in protein force fields: Limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations. Journal of Computational Chemistry. 2004;25(11):1400–1415. doi: 10.1002/jcc.20065
  • 38. Bjelkmar P, Larsson P, Cuendet MA, Hess B, Lindahl E. Implementation of the CHARMM force field in GROMACS: analysis of protein stability effects from correction maps, virtual interaction sites, and water models. Journal of Chemical Theory and Computation. 2010;6(2):459–466. doi: 10.1021/ct900549r
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1011255.r001

Decision Letter 0

Nir Ben-Tal, Jochen Hub

1 Nov 2022

Dear Dr Lindahl,

Thank you very much for submitting your manuscript "Gentle and fast all-atom model refinement to cryo-EM densities via Bayes' approach" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

Reviewers 1 and 2 are overall positive about the proposed refinement method. They request additional tests and a few clarifications in order to be able to evaluate the method. Reviewer 3 is more critical and raises conceptual concerns, in particular on the use of the heuristic force constant (as also mentioned by Reviewer 2) and on the assumption of a Poisson likelihood. Reviewer 3 also questions whether using a single experimental test case is sufficient for stating that the method outperforms alternatives.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Jochen Hub

Academic Editor

PLOS Computational Biology

Nir Ben-Tal

Section Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: In this work, the authors propose using relative entropy derived from Bayes’ approach in all-atom model refinement to cryo-EM density data. In the model refinement, a molecular force field and a similarity bias force are both used. Conventional similarity bias forces include inner product, cross correlation, and so on. The proposed relative entropy seems to have advantages over these conventional similarity bias forces because the effect of the similarity bias is longer-ranged, even when the overlap between structural models and cryo-EM densities is very small. Adaptive force scaling also looks useful for balancing the molecular force field and the similarity bias forces. In general, this work can provide new insight and useful information for all-atom model refinement to cryo-EM density data. We would like to suggest several points to improve the manuscript and to clarify the advantages of relative entropy in the refinement.

(1) For noise-free data, is it possible to test different initial structures? For instance, not only completely unaligned structures but also structures misaligned by one or two residues could be tested.

(2) To compare the performance of inner-product and cross-correlation with relative-entropy, not only FSC but also EMRinger, MolProbity, and CABLAM should be shown.

(3) In Figure 6, distorted models are used to show the performance of the relative-entropy approach. We suggest that the authors show more practical examples. For instance, there are several proteins whose structures were determined with X-ray crystallography, with other structures solved by cryo-EM at lower resolution. We would like to see how well relative entropy performs when refining all-atom models against lower-resolution cryo-EM densities, starting from an X-ray structure in a different physiological state. As suggested above, not only FSC but also EMRinger, MolProbity, and CABLAM could be used to investigate the performance.

Reviewer #2: The manuscript by Blau et al. describes an algorithm, based on molecular dynamics simulations, to refine protein structures against cryo-EM density maps. In their description, the assumption is made that the EM density map describes a single conformation and that the goal of the refinement is a single, best-fitting model.

Applications to simulated and experimental density maps are shown. Overall, this is a very interesting manuscript, which describes a new development of MD-based structure refinement in GROMACS.

The method seems to be well applicable and fast, it is therefore a valuable contribution for the (computational) structural biology community and well suited for PLOS Computational Biology.

Here are a few points that need to be clarified:

- A new refinement potential is derived, which results in the Kullback-Leibler divergence between model density and EM density. The authors emphasize that this refinement potential is smooth. I would think that ideally, all experimental information should be used during the refinement. I understand that high-resolution density maps are rugged and lead to slower convergence, since many barriers need to be crossed, but this is often accounted for by a stepwise refinement that starts with low-pass filtered maps (as e.g. described in cascade MDFF) and uses the full-resolution maps at later stages of the refinement. It does not sound like a desirable feature to use only a smooth refinement potential throughout and thereby possibly ignore valuable high-resolution information. The authors might want to clarify whether high-resolution information is indeed potentially ignored with their approach.

- Where does the 3sigma potential energy threshold come from? This is not really described in the text.

- In the applications to experimental data, the model was fit against half map 1? Why was that done? The half maps are usually the raw output of the density refinement program (e.g. RELION); they are typically neither filtered nor sharpened and should therefore not be used for structure refinement.

- The FSC threshold for the resolution definition is 0.143 when comparing the two half maps (after a gold-standard refinement). The FSC threshold for comparing a model map with the full EM map (the sum of both half maps), also referred to as the model-map cross-resolution, is 0.5.

That means, if the half map FSC is 0.143 at a resolution value XÅ, the model-map FSC should have a value of 0.5 at the same resolution XÅ. However, in Figures 5 and 6 this is not the case.

Here, the model is compared to only a half map, which is unusual and makes this comparison more difficult.

This needs to be clarified.

- page 10: "The force constant for density-guided simulations cannot be derived from the cryo-EM density alone and thus needs to be set heuristically."

In principle, the force constant could in fact be determined from the cryo-EM data; this requires modeling the error of the density map. The main motivation to use Bayes (in my opinion) is that it does not require any tunable parameters if all errors are modeled. However, in this work, there are still several tunable parameters: the force constant, the filter of the density maps, and the 3sigma potential cut-off that selects the best structure from the trajectory. The error of the density map could in principle be modeled based on the FSC. When I read "Bayes" in the title, I immediately expected that the force constant would be obtained by modeling the experimental errors. It might be helpful to the reader to clarify why this was not done.

- The relative-entropy based density potential enforces a global agreement of model and density.

How does that affect the typical practical problem of an incomplete model (e.g. missing long loops, missing domains)? Would you expect larger distortions than with the traditional refinement target functions?

- Throughout the manuscript: Cryo-EM does not yield an "electron density" map, but an electrostatic potential map. Please do not use "electron density". (X-ray diffraction yields electron density)

Appendix S1, "II Gaussian Spreading on a Grid":

"we assume Gaussian spatial noise", I do not understand why this has anything to do with noise?

This is just a convolution with a Gaussian, or not?

And how is the factor A chosen? This should depend on atom type. There is a reference to a textbook by Kirkland, but I think it would be helpful to just list values for the few most relevant atom types in proteins.

typos:

page 7: "vitr*e*ous ice"

page 10: "This issue can *be* alleviated in..."

page 14: "excees" -> "exceeds"

Appendix page 4: "Wi*gn*er-Seitz"

Reviewer #3: see attached review

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Attachment

Submitted filename: review.pdf

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1011255.r003

Decision Letter 1

Nir Ben-Tal, Jochen Hub

29 Mar 2023

Dear Dr Lindahl,

Thank you very much for submitting your manuscript "Gentle and fast all-atom model refinement to cryo-EM densities via Bayes' approach" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

Reviewers 1 and 2 confirmed that their comments have been largely addressed. Reviewer 3, however, points out that several mathematical equations, as well as new text added during the revision, are not justified or not sufficiently explained. Their criticism that the method is not "Bayesian" remains (because the posterior is never sampled).

From Reviewer 3's report and from my own reading, there is no need for new simulations or analysis prior to acceptance; the criticism can be addressed by revising the discussion and the mathematical derivations and motivations. Most critically, either a discussion on how to render the method "Bayesian" in a future study is needed, or the claim that the method is Bayesian must be toned down. The criterion used to stop the simulation must be stated (since Fig. S3 does not reveal any convergence). Apart from following the comments by Reviewer 3, please also revise Eq. 9, as the partial derivative of S with respect to S should be one. Maybe you need to use a different integration variable here?

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Jochen Hub

Academic Editor

PLOS Computational Biology

Nir Ben-Tal

Section Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors replied to almost all the questions and comments raised by the three reviewers. We are satisfied with their replies to our previous comments. In particular, Figure 6, a new figure in the revised manuscript, is helpful for understanding the behavior of relative entropy in the refinement.

However, before publication of this manuscript, we suggest correcting the following points:

(1) Add the label "C" in Figure 6. I can find only the labels "A" and "B" in Figure 6.

(2) The vertical axis label "FSC_AVG@2.3A" seems to be wrong. 1/2.3 = 0.434. Figure 6b suggests that the FSC value at this resolution (0.434) is almost zero. Please use the correct resolution in the vertical axis label.

Reviewer #2: My concerns have been fully addressed in the revised version of the manuscript.

Reviewer #3: see attached PDF

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Attachment

Submitted filename: review2.pdf

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1011255.r005

Decision Letter 2

Nir Ben-Tal, Jochen Hub

9 Jun 2023

Dear Dr Lindahl,

Thank you for further improving the manuscript. The power of the method, together with the remaining room for future developments, is now clearly stated. Congratulations on this insightful work!

We are pleased to inform you that your manuscript 'Gentle and fast all-atom model refinement to cryo-EM densities via a maximum likelihood approach' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Jochen Hub

Academic Editor

PLOS Computational Biology

Nir Ben-Tal

Section Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1011255.r006

Acceptance letter

Nir Ben-Tal, Jochen Hub

23 Jul 2023

PCOMPBIOL-D-22-01441R2

Gentle and fast all-atom model refinement to cryo-EM densities via a maximum likelihood approach

Dear Dr Lindahl,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Bernadett Koltai

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    Supplementary Materials

    S1 Appendix. Derivation of refinement potentials.

    Derivation of relative entropy measures from Poisson noise assumption, grid scattering assumptions, model density gradients and similarity measure definitions.

    (PDF)

    S1 Table. Scattering cross sections.

    Scattering amplitudes for common atoms in biological molecules at 150keV, derived from Appendix C in Ref. [31].

    (PDF)

    S2 Table. Aligned helix fitting.

    Heavy-atom RMSD [Å] at the final frame compared to conformation from which density was generated.

    (PDF)

    S3 Table. Unaligned helix fitting.

    Heavy-atom RMSD [Å] at the final frame compared to conformation from which density was generated.

    (PDF)

    S4 Table. Aldolase fitting.

    Heavy-atom RMSD [Å] from final simulation frames, as compared to PDB id 6V20.

    (PDF)

    S5 Table. Aldolase model statistics.

    Properties calculated with phenix-1.18.2-3874, using MolProbity, CABLAM, and EMRinger methods. All data reflect the final frame without any further geometry optimization.

    (PDF)

    S6 Table. GroEL fitting.

    Heavy-atom RMSD [Å] from final simulation frames, as compared to conformational state 3 of PDB id 5W0S.

    (PDF)

    S7 Table. GroEL model statistics.

    Properties calculated with phenix-1.18.2-3874, using MolProbity, CABLAM, and EMRinger methods. All data reflect the final frame without any further geometry optimization.

    (PDF)

    S1 Fig. Adaptive force constant change.

    Evolution of the force constant (arbitrary units) during aligned helix refinement for inner-product (purple), cross-correlation (ochre), relative-entropy (green) and relative-entropy-swapped (blue).

    (PDF)

    S2 Fig. Final results of aligned helix fitting.

    Simulations using adaptive force scaling from aligned starting conformation using inner-product, cross-correlation, relative-entropy swapped, and relative entropy (ordered top to bottom).

    (PDF)

    S3 Fig. Final results of unaligned helix fitting.

    Simulations using adaptive force scaling from aligned starting conformation using inner-product, cross-correlation, relative-entropy swapped, and relative entropy (ordered top to bottom).

    (PDF)

    S4 Fig. Aldolase refinement.

    Difference in FSC to deposited Aldolase model for inner-product (purple), cross-correlation (ochre), swapped relative-entropy (blue) and relative-entropy (green) based on refinement final frames (solid).

    (PDF)

    S5 Fig. GroEL subunit refinement.

    Difference in FSC to deposited GroEL model for inner-product (purple), cross-correlation (ochre), swapped relative-entropy (blue) and relative-entropy (green) based on refinement final frames (solid).

    (PDF)

    S6 Fig. Relative entropy can be sensitive to additional density.

    Relative entropy-based refinement (blue-red, according to RMSD to published model) can struggle in regions with extra density present in the map, whereas the cross-correlation based potential (ochre) adheres to the local minimum defined by the density.

    (TIF)

    S7 Fig. Aldolase refinement from distorted structure.

    Difference in FSC to manually built model for inner-product (purple), cross-correlation (ochre), relative-entropy swapped (dark blue) and relative-entropy (green) based refinements with best accepted FSC (dotted) and final FSC (solid).

    (PDF)

    Attachment

    Submitted filename: review.pdf

    Attachment

    Submitted filename: blau_plos_response.pdf

    Attachment

    Submitted filename: review2.pdf

    Attachment

    Submitted filename: response_to_reviewers.pdf

    Data Availability Statement

    Simulation starting structures, generated densities, setup parameters, complete workflow setups via Makefiles and Python scripts to generate Fig 2, as well as data for Fig 3 and per-residue RMSD, are available via Zenodo (https://doi.org/10.5281/zenodo.4556616). The code to perform density-guided molecular dynamics simulations is maintained within GROMACS and publicly available in release 2021 and later, as well as in the repository at https://gitlab.com/gromacs/gromacs. Fourier shell correlation analysis of trajectories has been implemented on top of the GROMACS codebase following conventions in EMAN2 [25] and is available at https://gitlab.com/gromacs/gromacs/-/commits/fscavg.


    Articles from PLOS Computational Biology are provided here courtesy of PLOS