Abstract
Macromolecular structure determination by electron cryo-microscopy (cryo-EM) is limited by the alignment of noisy images of individual particles. Because smaller particles have weaker signals, alignment errors impose size limitations on its applicability. Here, we explore how image alignment is improved by the application of deep learning to exploit prior knowledge about biological macromolecular structures that would otherwise be difficult to express mathematically. We train a denoising convolutional neural network on pairs of half-set reconstructions from the electron microscopy data bank (EMDB) and use this denoiser as an alternative to a commonly used smoothness prior. We demonstrate that this approach, which we call Blush regularization, yields better reconstructions than do existing algorithms, in particular for data with low signal-to-noise ratios. The reconstruction of a protein–nucleic acid complex with a molecular weight of 40 kDa, which was previously intractable, illustrates that denoising neural networks will expand the applicability of cryo-EM structure determination for a wide range of biological macromolecules.
Subject terms: Electron microscopy, Software, Single-molecule biophysics
Blush regularization makes use of a neural network pre-trained on a diverse set of high-resolution cryo-EM half-maps to improve image alignment, effectively lowering the size barrier, during cryo-EM structure determination.
Main
Despite rapid progress in cryo-EM technology in the past decade1, many biological macromolecules of interest are still too small to allow reliable structure determination. To limit the damage that electrons cause to biological structures of interest, cryo-EM images are taken using low doses of electron radiation, leading to high levels of experimental noise. The noise in the images impedes their alignment, resulting in an ill-posed optimization problem in which many reconstructions (which might be noisy or artifactual) are equally probable, given the data. The ill-posedness of the reconstruction imposes a minimum size barrier for cryo-EM structure determination, because smaller complexes yield images with lower signal-to-noise ratios. Although this barrier has been overcome in experiments involving the formation of complexes between small targets and other proteins2, the formation of sufficiently rigid complexes is often difficult. Here we explore a computational method that lowers the size barrier for existing cryo-EM datasets.
Even for ill-posed reconstruction problems, the correct solution can still be identified through the incorporation of prior knowledge. Most cryo-EM structures are calculated using explicit regularization of a likelihood function in Fourier space, which assumes cryo-EM reconstructions are smooth in real space3–5. Although we know much more about the structures of biological macromolecules beyond just the fact that their density varies smoothly, it has been difficult to incorporate richer sources of prior knowledge into the optimization process. Denoising convolutional neural networks can incorporate complex prior knowledge into an iterative optimization process6. By training a denoising network on simulated pairs of noisy and ground-truth images, we have previously provided proof of principle that prior knowledge about protein structures can be exploited to improve cryo-EM structure determination7. However, we also observed problems with overfitting and the hallucination of protein-like features in the resulting reconstructions. Moreover, because experimental cryo-EM structures often comprise regions of well-ordered proteins and nucleic acid domains alongside less structured regions, including, for example, membrane patches or flexible domains, it was not clear how ground-truth pairs for experimental cryo-EM data could be generated.
Here, we demonstrate how a pre-trained denoising convolutional neural network, trained and deployed in an application-specific manner inspired by the noise2noise approach8 (Fig. 1 and Methods), can improve cryo-EM structure determination using experimental data. Through this approach, which we call Blush regularization, we improve reconstructions across a variety of existing cryo-EM datasets, including one for a protein–nucleic acid complex that was too small for analysis using existing methods.
Results
Blush regularization improves reconstruction without overfitting
We first tested Blush regularization on a cryo-EM dataset (EMPIAR-10330)9 for the Plasmodium falciparum chloroquine resistance transporter (PfCRT)10. This dataset has been used as a standard to demonstrate the performance of several approaches in reducing overfitting during cryo-EM refinement11,12. Standard refinement using regularized likelihood optimization in RELION, which we refer to as the baseline, yields an overall resolution of 3.8 Å for this dataset.
Application of Blush regularization (Fig. 2) yielded an overall resolution estimate of 3.4 Å. In the last iteration, spectral trailing, a heuristic method that prevents overfitting by limiting the spatial frequency at which information from the denoiser is used (Methods), was applied with a cut-off at 3.5 Å. Compared with the baseline reconstruction, local resolution improved for most regions of the map, with a corresponding increase in visible side-chain densities. The improvement in resolution, as measured by half-map Fourier shell correlation (FSC), was confirmed by FSCs between both maps and the atomic model that was deposited for this dataset (Protein Data Bank (PDB): 6UKJ). Throughout this paper, FSCs between the map and atomic model were calculated using Servalcat13. We also assessed the relative quality of both maps by application of our automated model-building software ModelAngelo14, which generated a model with 84% completeness in the baseline map and 97% completeness in the Blush map. Model completeness is defined as the percentage of residues that match the reference model with a Cα distance of 3 Å or less.
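The completeness metric defined above can be illustrated with a short Python sketch. This is not ModelAngelo's implementation: the function name is hypothetical, and the sketch assumes that both models are already in the same coordinate frame and that a reference residue counts as matched when any built Cα atom lies within the cut-off, ignoring residue identity and chain assignment.

```python
import math

def model_completeness(reference_ca, built_ca, cutoff=3.0):
    """Percentage of reference residues with a built C-alpha within `cutoff` angstroms.

    Assumption: matching is purely distance-based (no sequence or chain check).
    """
    if not reference_ca:
        raise ValueError("reference model has no residues")
    matched = 0
    for ref in reference_ca:
        # A reference residue is matched if any built C-alpha is close enough.
        if any(math.dist(ref, ca) <= cutoff for ca in built_ca):
            matched += 1
    return 100.0 * matched / len(reference_ca)

# Toy coordinates (angstroms): four reference residues, three built residues.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
built = [(0.5, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 4.0)]
completeness = model_completeness(reference, built)
print(f"completeness: {completeness:.1f}%")
```

In this toy case only the first two reference residues have a built Cα within 3 Å, so half of the model is counted as complete.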
To assess the potential for overfitting by the denoiser, we also performed a phase-randomization test15. We applied Blush regularization without spectral trailing for refinement of the PfCRT dataset with phase randomization beyond 4-Å resolution. Although spectral trailing was not used, no overfitting was observed. Switching off spectral trailing led to a marginal improvement in the quality of reconstruction, as quantified by the FSC between the map and the atomic model (Fig. 2d). These results indicate that the denoiser can prevent overfitting for this dataset, even without spectral trailing. In general, we still recommend running Blush regularization with spectral trailing, because the benefits of switching it off are small and overfitting could be more prominent for other datasets. Consequently, in the following sections, we present results obtained only using spectral trailing.
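The idea behind phase randomization can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the procedure of ref. 15 or of RELION: the function names are hypothetical, the input is a generic real-valued array such as a particle image, and the cut-off is a hard threshold in spatial frequency.

```python
import numpy as np

def frequency_grid(shape, voxel_size):
    """Spatial frequency (1/angstrom) of every Fourier voxel for an array of `shape`."""
    axes = [np.fft.fftfreq(n, d=voxel_size) for n in shape]
    grids = np.meshgrid(*axes, indexing="ij")
    return np.sqrt(sum(g ** 2 for g in grids))

def randomize_phases(image, voxel_size, resolution, seed=0):
    """Replace Fourier phases beyond `resolution` (angstroms) with random phases."""
    rng = np.random.default_rng(seed)
    ft = np.fft.fftn(image)
    beyond = frequency_grid(image.shape, voxel_size) > 1.0 / resolution
    random_phase = rng.uniform(0.0, 2.0 * np.pi, size=image.shape)
    ft[beyond] = np.abs(ft[beyond]) * np.exp(1j * random_phase[beyond])
    # The random phases are not Hermitian-symmetric, so the inverse transform is
    # complex; taking the real part keeps components below the cut-off intact and
    # only approximately preserves amplitudes beyond it.
    return np.fft.ifftn(ft).real

rng = np.random.default_rng(1)
image = rng.normal(size=(32, 32))
scrambled = randomize_phases(image, voxel_size=1.0, resolution=4.0)
```

Because any signal beyond the cut-off is destroyed, an apparent correlation between half-maps at those frequencies after refinement would point to overfitting rather than to real structure.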
Blush expands the applicability of cryo-EM reconstruction
We subsequently assessed the broader applicability of Blush regularization by applying it to four types of structures and refinement methods.
First, we tested Blush regularization on a small membrane protein, Ste2, which is a dimeric G-protein-coupled receptor (GPCR)16 (Fig. 3 and Extended Data Table 1). Full-length monomeric Ste2 has a molecular weight of 47.85 kDa, which includes a long disordered carboxy-terminal tail that comprises 125 amino acids. The total mass of the ordered dimeric Ste2 that contributes to alignment is roughly 67 kDa, most of which lies embedded in a detergent micelle.
The dataset used was acquired from a similar complex to that in PDB entry 7QB9, reported in ref. 16, but with different biochemical conditions affecting the stability of the structure. Alignment of images of Ste2 is difficult because few protein features extend from the smooth detergent micelle. Baseline reconstruction yielded a map with an overall resolution of 3.8 Å, with limited densities for side chains. Application of Blush regularization led to a structure with an overall resolution of 3.4 Å. Spectral trailing ensured that no information from the denoiser was inserted beyond 3.7-Å resolution. Compared with the baseline reconstruction, the density of the transmembrane helices is improved. Loops at the top and bottom of the structure are still relatively poorly resolved, probably owing to molecular flexibility. In agreement with the visibility of improved side-chain densities and local resolution estimates, the completeness of models built by ModelAngelo in these maps improved from 19% to 43%.
Second, we evaluated the performance of Blush regularization in multi-body refinement17, in which partial signal subtraction is used to align independently moving domains within a larger complex. Reconstructions from subtracted images were not included in the training set for the denoiser. Moreover, signal subtraction reduces the amount of signal in each image, placing stringent limitations on the minimal size of domains that can be aligned. We applied Blush regularization in multi-body refinement of a publicly available dataset (EMPIAR-10180) for the Saccharomyces cerevisiae pre-catalytic spliceosomal B complex18 (Fig. 4). Using four bodies, one each for the core, the foot, the helicase and the SF3b regions, Blush regularization improved the quality of reconstructions of all domains compared with baseline multi-body refinement, as measured by local resolution, half-map FSCs and FSCs with the reference atomic model (PDB: 5NRL). The improvements in resolution were largest in the helicase and SF3b regions, which are the most flexible and thus the hardest to reconstruct. The improvements in resolution were reflected by automated model building in ModelAngelo, which increased model completeness of the entire complex from 32% to 48%. In particular, the model completeness for the SF3b region was improved from 3% to 29%.
Third, we assessed the performance of Blush regularization for a biological assembly that differs from the types of structures that the denoiser was trained on: the first intermediate amyloid (FIA) that forms during the in vitro assembly of recombinant tau (residues 297–391)19. This dataset is also publicly available (EMPIAR-11720). Unlike any of the structures in the training set, the FIA has helical symmetry (Fig. 5). It is an amyloid filament, with parallel β-strands repeating every 4.7 Å in the direction of the helical axis. Besides deviating from the types of structures in the training set, the FIA is also one of the smallest amyloid structures solved to date, with only 15 ordered residues in each of two opposing β-sheets. Baseline helical refinement yielded a 5.0-Å-resolution map, in which the density for β-strands along the helical axis was not separated, and no atomic model could be built. Blush regularization improved the resolution to 2.8 Å, and ModelAngelo built all 15 ordered residues in the resulting map.
Fourth, we applied Blush to the small anti-CRISPR associated protein 2 (Aca2) bound to RNA, which has a total molecular weight of 40 kDa (Fig. 6 and Extended Data Table 1). Using different classification and refinement strategies in baseline RELION and CryoSPARC, we could not obtain a reliable reconstruction. Although an initial model generated using the standard VDAM algorithm in RELION20 suffered from anisotropy, the first three-dimensional (3D) classification using Blush regularization resulted in one class with recognizable protein features. Similar 3D classifications without Blush regularization did not yield recognizable protein features. Refinement of the corresponding class yielded a better initial model for a second 3D classification, from which a single class was selected for subsequent CTF refinement21 and particle polishing21. A 3D classification was performed without alignment, followed by a final 3D refinement. Blush regularization was used for all 3D classifications with alignment and 3D refinements. The resolution of the final map was 2.5 Å, with ModelAngelo successfully building 97% of the protein sequence and 33 out of 42 nucleotides.
Discussion
In a previous approach using noise2noise, implemented in the M software22, a new neural network is trained for each dataset that it is applied to, using only half-maps from the same dataset. As such, the neural network in the M software can learn only features that are specific to the dataset at hand. By contrast, we pre-train a single neural network on a diverse set of high-resolution half-maps from the EMDB. Our pre-trained network improves cryo-EM reconstructions for a wide variety of macromolecular complexes, suggesting that it has learned useful features about cryo-EM structures in general. In addition, although our approach was inspired by noise2noise, it blends the unsupervised elements from noise2noise training with new application-specific elements, such as recycling and supervised masks in Fourier space and in real space. An interesting avenue for future research could be a combination of the two approaches, in which the pre-trained Blush network is fine-tuned using the half-maps of the dataset at hand, using techniques similar to those implemented in M.
We previously attempted to incorporate prior knowledge about protein structures by training a denoiser on pairs of noisy and ground-truth maps that were calculated from atomic models, and observed problems with overfitting and hallucinations7. Similar problems could explain why the application of the DeepEMhancer neural network23 inside the iterative reconstruction algorithm of RELION had to be restricted to only a few iterations at the end of refinement24. The approach in this paper reduces the risk of hallucinations of protein-like features in reconstructions by using a neural network that is trained only on experimental cryo-EM half-maps, that is without the atomic models or the geometrical restraints that are used to describe them.
Instead of forcing the map to resemble densities derived from atomic models, our denoiser is trained to introduce more subtle modifications to cryo-EM maps, such as smoothing out density in solvent regions or in detergent micelles. The network also removes artifacts that are commonly encountered in difficult cryo-EM refinements, for example anisotropic densities that result from uneven angular distributions, or radially extending, streaky features that are often observed in overfitted maps (Fig. 1f,g). Our findings illustrate that, although the effect of a single application of the denoiser is relatively small, its cumulative impact over several iterations enhances the performance of cryo-EM structure determination across a diverse range of test cases. As the ability of machine-learning methods to extract knowledge from large datasets improves, it could be tempting to leverage more structural information about biological macromolecules in the reconstruction process. However, doing so could ultimately diminish one of the most powerful ways of assessing whether a reconstruction is correct: the presence of expected features in the map. We thus anticipate that the cryo-EM community will continue to explore the question of how much prior knowledge should inform the reconstruction process, and how much should be kept aside for validation.
In the framework of Blush regularization, the denoiser replaces the filter operation that constrains the power of Fourier-space components in the baseline algorithm. As a result, the FSC between independently refined subsets is no longer used to define a 3D Wiener filter that is applied to the intermediate reconstructions. Instead, this FSC is used to determine a resolution cut-off (ρ), beyond which the Fourier components of the two denoised half-maps are set to zero. Because Fourier components near the resolution estimate of the final map will not have been affected by the denoiser, overestimation of resolution owing to the denoiser cannot happen directly.
Although spectral trailing represents the first attempt to prevent overestimation of resolution when using information-rich priors in cryo-EM reconstruction, it might not be the optimal solution. In fact, as exemplified by the PfCRT dataset (Fig. 2), spectral trailing can lead to underestimation of resolution. Future exploration of the damping effect of the network in Fourier space could lead to better approaches to safeguard against overestimation of resolution. Other research topics that might be worth exploring include the adaptation of the VDAM algorithm20 in RELION to also use Blush regularization, which may improve initial model generation. In fact, provided that they allow modification of real-space maps, a wide range of cryo-EM methods could be improved by Blush regularization, ranging from standard refinement approaches in alternative software packages to approaches for dealing specifically with structural heterogeneity, for example25–27.
In all our tests, the performance of Blush regularization surpassed or matched that of the baseline implementation in RELION. We observed the largest differences for cases in which the baseline approach tended to overfit the data. Consequently, Blush regularization will be most useful for refinements of datasets with low signal-to-noise ratios, such as those of small complexes or complexes embedded in thick ice layers, multi-body refinements involving relatively small bodies and refinements of maps exhibiting pronounced variations in local resolution. For example, Blush regularization allowed reconstruction of an amyloid with only 30 residues in its ordered core, and of the Aca2–RNA complex with a molecular weight of 40 kDa. Although nucleic acids result in higher signal-to-noise ratios than do proteins, 40 kDa approaches predicted minimal sizes for a protein that is amenable to cryo-EM structure determination28,29. These results demonstrate that denoising convolutional neural networks expand the applicability of cryo-EM structure determination.
Methods
Rationale
The noise2noise framework8 facilitates the training of a denoising convolutional neural network in the absence of explicit access to ground-truth images. Instead, it relies on pairs of noisy images to extract information about their shared signal. Here, we present an application-specific approach that incorporates this aspect from the noise2noise framework. We trained a denoiser on a set of 422 pairs of noisy half-maps that we downloaded from the EMDB30. We selected only entries with reported resolutions higher than 4 Å for which both unfiltered half-maps were deposited. Maps with obvious artifacts, for example those associated with overfitting, and maps of a structure that was already present in the training set were eliminated during manual curation.
We tailored data augmentation and training of the denoiser to integrate with the iterative expectation-maximization algorithm for cryo-EM reconstruction. All pairs of half-maps, $x_i^{(k)}$, with $k \in \{0,1\}$, were re-scaled to a uniform voxel size of 1.5 Å, and augmented by generating new pairs $(y_i, \hat{y}_i)$:
$$y_i = H_{C,A}\left[x_i^{(k)} + \epsilon\right] \qquad (1)$$
$$\hat{y}_i = H_{\hat{C},A}\left[M_i \odot x_i^{(1-k)} + (1 - M_i) \odot h\left(x_i^{(1-k)}\right)\right] \qquad (2)$$
where $\epsilon$ is random colored noise, $M_i \in [0,1]^N$ is a smooth mask encapsulating the molecules of interest, $\odot$ represents voxel-wise multiplication and $h(\cdot)$ is a low-pass filter to 15 Å. $H_{C,A}[\cdot]$ applies an anisotropic Gaussian filter with covariance matrix $C$, an affine transform $A$ that includes rotation and translation, a crop to a patch of 64 × 64 × 64 voxels and a voxel-value standardization. Data augmentation was achieved through random assignments of $C$, $\hat{C}$, $A$, $k$, $\epsilon$ and $r$.
By using a range of resolution cut-offs for $C$ and $\hat{C}$, the denoiser explicitly learns to handle maps with varying resolutions. This is necessary for its application inside the iterative expectation-maximization algorithm, which typically starts at relatively low resolutions and gradually progresses to higher resolutions. Although using a lower resolution cut-off for $C$ than for $\hat{C}$ could have produced a network that enhances the resolution of the half-maps, similar to deblurring networks31, we opted not to do so to minimize the risk of hallucinations in high-resolution features.
Using different degrees of anisotropy in $C$ and $\hat{C}$, the denoiser learns to deal with the artifacts that arise from non-uniform orientational distributions, and random orientations and affine transformations in $A$ lead to invariance with respect to rotations, translations and intensity scale. Although initial versions of our training protocol did not include masks, we observed that the resulting networks would learn to smooth densities in disordered regions, such as the solvent or detergent micelles, which would improve image alignments. To amplify these effects, we then implemented the supervised masking approach with $M_i$ and $h(\cdot)$. By filling disordered regions with a 15-Å-resolution low-pass filtered version of the map, as opposed to a straightforward voxel-wise multiplication with the mask $M_i$, higher density values in regions with disordered molecules, such as detergent micelles, are maintained.
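The supervised masking step described above, in which regions outside the mask are filled with a low-pass filtered copy of the map, can be sketched with NumPy as follows. This is a minimal illustration, not the training code: the function names are hypothetical, the hard spherical cut-off in Fourier space is an assumed stand-in for the low-pass filter (whose exact shape is not specified here), and the anisotropic filtering, affine transform, cropping and standardization are omitted.

```python
import numpy as np

def low_pass(volume, voxel_size, resolution):
    """Hard Fourier low-pass: zero every component beyond `resolution` (angstroms).

    Assumption: a sharp cut-off is used purely for illustration.
    """
    axes = [np.fft.fftfreq(n, d=voxel_size) for n in volume.shape]
    grids = np.meshgrid(*axes, indexing="ij")
    freq = np.sqrt(sum(g ** 2 for g in grids))
    ft = np.fft.fftn(volume)
    ft[freq > 1.0 / resolution] = 0.0
    return np.fft.ifftn(ft).real

def masked_target(half_map, mask, voxel_size=1.5, resolution=15.0):
    """Keep the map inside the mask and a low-pass filtered copy outside it."""
    smooth = low_pass(half_map, voxel_size, resolution)
    return mask * half_map + (1.0 - mask) * smooth

rng = np.random.default_rng(0)
half_map = rng.normal(size=(16, 16, 16))
mask = np.zeros_like(half_map)
mask[4:12, 4:12, 4:12] = 1.0  # toy binary mask; a real mask would have soft edges
target = masked_target(half_map, mask)
```

Compared with multiplying by the mask alone, this blend keeps the average density level of regions such as a detergent micelle while removing their high-resolution detail.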
By re-scaling all maps to a common voxel size of 1.5 Å, and then cropping maps to patches of 64 × 64 × 64 voxels, the network can be trained on and applied to maps of any size. To apply the denoiser to maps that are larger than one patch, overlapping patches can be denoised independently.
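One way to lay out overlapping patches over a larger map is sketched below. The overlap of 16 voxels, the function names and the choice to place the last patch flush with the map edge are assumptions made for illustration; how the overlapping outputs are merged is not specified here and is left out.

```python
from itertools import product

def patch_starts(size, patch=64, overlap=16):
    """Start offsets of overlapping patches that cover one axis of length `size`."""
    if size <= patch:
        # Maps smaller than one patch would need padding (not shown).
        return [0]
    stride = patch - overlap
    starts = list(range(0, size - patch, stride))
    starts.append(size - patch)  # final patch sits flush with the edge
    return starts

def patch_origins(shape, patch=64, overlap=16):
    """Origins (z, y, x) of all 3D patches needed to cover a map of `shape`."""
    return list(product(*(patch_starts(n, patch, overlap) for n in shape)))

origins = patch_origins((200, 200, 200))
print(len(origins), "patches, first axis offsets:", patch_starts(200))
```

Each origin can then be used to slice a 64 × 64 × 64 block from the map, pass it through the denoiser and write the result back.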
Training the denoiser
Our denoiser (fθ) consists of a U-net with approximately 13 million trainable parameters (θ) (Fig. 1). It is trained using residual learning32 and with a dropout rate of 50% (ref. 33). Instance normalization34 is used to handle small mini-batches, with b = 8 samples from the training dataset, during training. We minimize the following loss:
3 |
where R_r[fθ, y] returns the output of the denoiser fθ after recursively calling it r ∈ {0, …, 5} times with y as the initial input. This enables the denoiser to recognize and suppress artifacts brought about by its repeated usage, thereby limiting the amplification of artifacts in the reconstruction that are introduced by the denoiser during subsequent iterations of the expectation-maximization algorithm7.
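The recursive application of the denoiser can be sketched as a small higher-order function. The sketch assumes one possible reading of the definition above, namely that r = 0 corresponds to a single forward pass and that each additional recursion feeds the previous output back into the denoiser; the function name and the toy denoiser are hypothetical, and gradient handling during training is not shown.

```python
def recycle(denoiser, y, r):
    """Apply `denoiser` once, then feed its output back into itself `r` more times.

    Assumption: r = 0 means a single application of the denoiser.
    """
    out = denoiser(y)
    for _ in range(r):
        out = denoiser(out)
    return out

def toy_denoiser(values):
    """Stand-in for the network: pulls every value halfway towards zero."""
    return [0.5 * v for v in values]

noisy = [8.0, -4.0, 2.0]
print(recycle(toy_denoiser, noisy, r=0))
print(recycle(toy_denoiser, noisy, r=2))
```

Training on such recycled outputs exposes the network to its own output statistics, which is the situation it later meets inside the iterative refinement.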
Training for 950,000 steps took six days using a single Nvidia A100 GPU.
Iterative denoising with spectral trailing
We refer to the application of our pre-trained denoiser within the iterative expectation-maximization algorithm as Blush regularization. In our original work, with simulated data, we incorporated the denoiser into the L2 regularization in the M-step, on the basis of the approximation that the prior function is ‘close’ to a Gaussian7. In this work, we do not make formal claims about the role of the denoiser within a Bayesian framework. Instead, our approach is motivated by empirical observations.
Although one effect of the denoiser is that it tends to dampen Fourier components at higher spatial frequencies, the amount by which it does so is not well defined. Therefore, we use a heuristic method, here referred to as spectral trailing, to prevent overfitting in 3D autorefinement and multi-body refinement. First, we calculate the FSC between two independently refined half-maps before the denoiser is applied, and determine the ρ value at which the solvent-corrected FSC drops below 0.143. We then apply the denoiser to both half-maps and subsequently apply a low-pass filter at a spatial frequency that is two Fourier shells (each shell is one Fourier voxel wide) lower than ρ. If ρ exceeds the Nyquist frequency of the denoiser, here set to 3 Å, the remaining Fourier shells at higher frequencies are populated with the reconstruction from the standard regularization in Fourier space. The resulting denoised, low-pass-filtered maps are then used as references for alignment in the next iteration. The denoiser is not applied to the output of the final refinement step.
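The choice of the low-pass cut-off in spectral trailing can be sketched as follows. This is an illustration under assumptions rather than the RELION-5 code: the function names are hypothetical, ρ is taken to be the index of the first Fourier shell at which the FSC falls below 0.143, and the FSC curve is assumed to be already solvent-corrected. The handling of shells beyond the Nyquist frequency of the denoiser is not shown.

```python
def spectral_trailing_shells(fsc, threshold=0.143, trail=2):
    """Return (rho, cutoff) as Fourier shell indices.

    rho    : first shell whose FSC is below `threshold` (assumed definition)
    cutoff : shell at which the denoised half-maps are low-pass filtered
    """
    rho = len(fsc)  # FSC never drops below the threshold
    for shell, value in enumerate(fsc):
        if value < threshold:
            rho = shell
            break
    return rho, max(rho - trail, 0)

def shell_to_resolution(shell, box_size, voxel_size):
    """Resolution in angstroms of a Fourier shell for a cubic box."""
    if shell <= 0:
        raise ValueError("shell 0 has no finite resolution")
    return box_size * voxel_size / shell

# Toy FSC curve, one value per Fourier shell.
fsc_curve = [1.0, 0.99, 0.95, 0.8, 0.5, 0.2, 0.1, 0.05]
rho, cutoff = spectral_trailing_shells(fsc_curve)
print(rho, cutoff, shell_to_resolution(cutoff, box_size=64, voxel_size=1.5))
```

Keeping the cut-off two shells below ρ means that the Fourier components closest to the estimated resolution limit never pass through the denoiser.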
Blush regularization has been implemented in the open-source software RELION-5, using a combination of C++ and PyTorch. It can be used for 3D classification, multi-body refinement and 3D autorefinement jobs, including those for particles with point-group or helical symmetry. For 3D classification for data that are separated into independent half-sets, the filtered map from the regularized likelihood approach is used as input for the denoiser. No additional low-pass filtering is applied. In this job type, the denoiser is also applied in the last iteration.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41592-024-02304-8.
Acknowledgements
We thank J. Schwab, K. Yamashita, C.-B. Schönlieb and O. Öktem for helpful discussions; J. Grimmett, T. Darling and I. Clayson for help with high-performance computing; the EM facility at the Medical Research Council Laboratory of Molecular Biology for support with cryo-EM; N. Birkholz and P. Fineran for input into the design and production of Aca2–RNA complexes, funded by Bioprotection Aotearoa, Centre of Research Excellence (Tertiary Education Commission, New Zealand); and E. Brignole and C. Borsa for the smooth running of the MIT.nano cryo-EM facility, established in part with financial support from the Arnold and Mabel Beckman Foundation. M.E.W. is grateful to F. Zhang for funding support. T.N. is a member of the JEOL YOKOGUSHI Research Alliance Laboratories. This work was supported by the Medical Research Council as part of the United Kingdom Research and Innovation (MC_UP_A025_1013 to S.H.W.S.); the European Union’s Horizon 2020 research and innovation program (under grant agreement no. 895412 to D.K.); a Helen Hay Whitney Foundation Postdoctoral Fellowship (to M.E.W.); the Howard Hughes Medical Institute (to M.E.W.) and a Research Fellowship at Gonville and Caius College of Cambridge University (to V.V.). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. For the purpose of open access, the MRC Laboratory of Molecular Biology has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising.
Author contributions
D.K. designed and implemented Blush regularization, ran most experiments and analyzed the results. K.J. contributed to data preprocessing. M.E.W. contributed and analyzed the Aca2–RNA dataset, and contributed to analysis of the PfCRT dataset. S.L. contributed the FIA dataset. V.V. contributed the Ste2 dataset. T.N. analyzed the results. S.H.W.S. supervised the project and contributed to RELION integration. All authors contributed to the writing of the manuscript.
Peer review
Peer review information
Nature Methods thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Arunima Singh, in collaboration with the Nature Methods team. Peer reviewer reports are available.
Data availability
The full list of EMDB entries that were used to train the denoiser, along with the manually curated masks, can be downloaded from https://zenodo.org/records/10553452 (ref. 35). The Aca2–RNA dataset has been submitted to EMPIAR (EMPIAR-11918).
Code availability
Blush regularization has been implemented in the open-source software RELION-5, which is distributed for free under the GPLv2 license and can be downloaded from https://github.com/3dem/relion. Additionally, code used in the training procedure of the Blush denoiser model is available at https://github.com/dkimanius/blush-training.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Dari Kimanius, Email: dari.kimanius@czii.org.
Sjors H. W. Scheres, Email: scheres@mrc-lmb.cam.ac.uk
Extended data is available for this paper at 10.1038/s41592-024-02304-8.
Supplementary information
The online version contains supplementary material available at 10.1038/s41592-024-02304-8.
References
- 1.Kühlbrandt W. The resolution revolution. Science. 2014;343:1443–1444. doi: 10.1126/science.1251652. [DOI] [PubMed] [Google Scholar]
- 2.Wu X, Rapoport TA. Cryo-EM structure determination of small proteins by nanobody-binding scaffolds (legobodies) Proc. Natl Acad. Sci. USA. 2021;118:e2115001118. doi: 10.1073/pnas.2115001118. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Scheres SHW. A Bayesian view on cryo-em structure determination. J. Mol. Biol. 2012;415:406–418. doi: 10.1016/j.jmb.2011.11.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Scheres SHW. Relion: implementation of a bayesian approach to cryo-em structure determination. J. Struct. Biol. 2012;180:519–530. doi: 10.1016/j.jsb.2012.09.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Punjani A, Rubinstein JL, Fleet DJ, Brubaker MA. cryosparc: algorithms for rapid unsupervised cryo-em structure determination. Nat. Methods. 2017;14:290–296. doi: 10.1038/nmeth.4169. [DOI] [PubMed] [Google Scholar]
- 6.Romano Y, Elad M, Milanfar P. The little engine that could: regularization by denoising (red) SIAM J. Imaging Sci. 2017;10:1804–1844. doi: 10.1137/16M1102884. [DOI] [Google Scholar]
- 7.Kimanius D, et al. Exploiting prior knowledge about biological macromolecules in cryo-EM structure determination. IUCrJ. 2021;8:60–75. doi: 10.1107/S2052252520014384. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Lehtinen, J. et al. Noise2noise: learning image restoration without clean data. Preprint at arXiv 10.48550/arXiv.1803.04189 (2018).
- 9.Iudin A, et al. Empiar: the electron microscopy public image archive. Nucleic Acids Res. 2023;51:D1503–D1511. doi: 10.1093/nar/gkac1062. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Kim J, et al. Structure and drug resistance of the Plasmodium falciparum transporter PfCRT. Nature. 2019;576:315–320. doi: 10.1038/s41586-019-1795-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Ramlaul K, Palmer CM, Nakane T, Aylett CHS. Mitigating local over-fitting during single particle reconstruction with sidesplitter. J. Struct. Biol. 2020;211:107545. doi: 10.1016/j.jsb.2020.107545. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Punjani A, Zhang H, Fleet DJ. Non-uniform refinement: adaptive regularization improves single-particle cryo-em reconstruction. Nat. Methods. 2020;17:1214–1221. doi: 10.1038/s41592-020-00990-8. [DOI] [PubMed] [Google Scholar]
- 13.Yamashita K, Palmer CM, Burnley T, Murshudov GN. Cryo-EM single-particle structure refinement and map calculation using servalcat. Acta Crystallogr. D Biol. Crystallogr. 2021;77:1282–1291. doi: 10.1107/S2059798321009475. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Jamali, K., Kimanius, D. & Scheres, S. H. W. A graph neural network approach to automated model building in cryo-EM maps. In The Eleventh International Conference on Learning Representationshttps://openreview.net/forum?id=65XDF_nwI61 (ICLR, 2023).
- 15.Chen S, et al. High-resolution noise substitution to measure overfitting and validate resolution in 3D structure determination by single particle electron cryomicroscopy. Ultramicroscopy. 2013;135:24–35. doi: 10.1016/j.ultramic.2013.06.004.
- 16.Velazhahan V, Ma N, Vaidehi N, Tate CG. Activation mechanism of the class D fungal GPCR dimer STE2. Nature. 2022;603:743–748. doi: 10.1038/s41586-022-04498-3.
- 17.Nakane T, Kimanius D, Lindahl E, Scheres SHW. Characterisation of molecular motions in cryo-EM single-particle data by multi-body refinement in RELION. eLife. 2018;7:e36861. doi: 10.7554/eLife.36861.
- 18.Plaschka C, Lin P-C, Nagai K. Structure of a pre-catalytic spliceosome. Nature. 2017;546:617–621. doi: 10.1038/nature22799.
- 19.Lövestam S, et al. Disease-specific tau filaments assemble via polymorphic intermediates. Nature. 2024;625:119–125. doi: 10.1038/s41586-023-06788-w.
- 20.Kimanius D, Dong L, Sharov G, Nakane T, Scheres SHW. New tools for automated cryo-EM single-particle analysis in RELION-4.0. Biochem. J. 2021;478:4169–4185. doi: 10.1042/BCJ20210708.
- 21.Zivanov J, et al. New tools for automated high-resolution cryo-EM structure determination in RELION-3. eLife. 2018;7:e42166. doi: 10.7554/eLife.42166.
- 22.Tegunov D, Xue L, Dienemann C, Cramer P, Mahamid J. Multi-particle cryo-EM refinement with M visualizes ribosome-antibiotic complex at 3.5 Å in cells. Nat. Methods. 2021;18:186–193. doi: 10.1038/s41592-020-01054-7.
- 23.Sanchez-Garcia R, et al. DeepEMhancer: a deep learning solution for cryo-EM volume post-processing. Commun. Biol. 2021;4:874. doi: 10.1038/s42003-021-02399-1.
- 24.Ramirez-Aportela, E., Carazo, J. M. & Sorzano, C. O. S. Higher resolution in cryo-EM by the combination of macromolecular prior knowledge and image-processing tools. IUCrJ 9, 632–638 (2022).
- 25.Punjani, A. & Fleet, D. J. 3DFlex: determining structure and motion of flexible proteins from cryo-EM. Nat. Methods 20, 860–870 (2023).
- 26.Herreros D, et al. Estimating conformational landscapes from cryo-EM particles by 3D Zernike polynomials. Nat. Commun. 2023;14:154. doi: 10.1038/s41467-023-35791-y.
- 27.Kimanius D, Jamali K, Scheres SHW. Sparse Fourier backpropagation in cryo-EM reconstruction. Adv. Neural Inf. Process. Syst. 2022;35:12395–12408.
- 28.Henderson R. The potential and limitations of neutrons, electrons and X-rays for atomic resolution microscopy of unstained biological molecules. Q. Rev. Biophys. 1995;28:171–193. doi: 10.1017/S003358350000305X.
- 29.Dickerson JL, Lu P-H, Hristov D, Dunin-Borkowski RE, Russo CJ. Imaging biological macromolecules in thick specimens: the role of inelastic scattering in cryoEM. Ultramicroscopy. 2022;237:113510. doi: 10.1016/j.ultramic.2022.113510.
- 30.Lawson CL, et al. EMDataBank unified data resource for 3DEM. Nucleic Acids Res. 2016;44:D396–D403. doi: 10.1093/nar/gkv1126.
- 31.Albluwi, F., Krylov, V. A. & Dahyot, R. Image deblurring and super-resolution using deep convolutional neural networks. In 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP) 1–6 (IEEE, 2018).
- 32.Zhang K, Zuo W, Chen Y, Meng D, Zhang L. Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 2017;26:3142–3155. doi: 10.1109/TIP.2017.2662206.
- 33.Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. R. Improving neural networks by preventing co-adaptation of feature detectors. Preprint at arXiv https://doi.org/10.48550/arXiv.1207.0580 (2012).
- 34.Ulyanov, D., Vedaldi, A. & Lempitsky, V. Instance normalization: the missing ingredient for fast stylization. Preprint at arXiv https://doi.org/10.48550/arXiv.1607.08022 (2016).
- 35.Kimanius, D. Blush training dataset masks. Zenodo https://doi.org/10.5281/zenodo.10553451 (2024).
Data Availability Statement
The full list of EMDB entries that were used to train the denoiser, along with the manually curated masks, can be downloaded from https://zenodo.org/records/10553452 (ref. 35). The Aca2–RNA dataset has been submitted to EMPIAR (EMPIAR-11918).
Blush regularization has been implemented in the open-source software RELION-5, which is distributed for free under the GPLv2 license and can be downloaded from https://github.com/3dem/relion. Additionally, code used in the training procedure of the Blush denoiser model is available at https://github.com/dkimanius/blush-training.