
DeepEM Playground: Bringing deep learning to electron microscopy labs

Hannah Kniesel 1, Poonam Poonam 1, Tristan Payer 1, Tim Bergner 2, Pedro Hermosilla 3, Timo Ropinski 1

Abstract

Deep learning (DL) has transformed image analysis, enabling breakthroughs in segmentation, object detection, and classification. However, a gap persists between cutting‐edge DL research and its practical adoption in electron microscopy (EM) labs. This is largely due to the inaccessibility of DL methods for EM specialists and the expertise required to interpret model outputs.

To bridge this gap, we introduce DeepEM Playground, an interactive, user‐friendly platform designed to empower EM researchers – regardless of coding experience – to train, tune, and apply DL models. By providing a guided, hands‐on approach, DeepEM Playground enables users to explore the workings of DL in EM, facilitating both first‐time engagement and more advanced model customisation.

The DeepEM Playground lowers the barrier to entry and fosters a deeper understanding of deep learning, thereby enabling the EM community to integrate AI‐driven analysis into their workflows more confidently and effectively.

Keywords: computer vision, deep learning, electron microscopy

1. INTRODUCTION

EM has long been a cornerstone of scientific research, enabling the visualisation of structures at the nanometre scale. However, the interpretation and analysis of EM micrographs often demand significant manual effort and expertise. 1 , 2 , 3 Meanwhile, in recent years, DL has revolutionised Computer Vision, offering automated and efficient solutions for complex image analysis tasks such as segmentation, object detection, and classification. Hence, the integration of DL into EM image analysis holds immense potential to enhance accuracy, speed, and reproducibility in data interpretation.

However, applying DL to EM data presents unique challenges. Unlike natural images, which are typically colour‐scaled and formed by light interactions, EM images are greyscale and formed by entirely different physical processes. Contrast in EM images arises from interactions between the electron beam and the sample's atomic nuclei and electron clouds, leading to scattering and absorption that influence the detected electron intensity. Additionally, different EM modalities, such as TEM, STEM and SEM, introduce further variations in image formation, making it difficult to directly apply DL models trained on real‐world photographs.

As a result, DL for EM has emerged as a dedicated research field, developing tailored methods for EM image analysis. For instance, convolutional neural networks (CNNs) have been employed for the automatic detection 4 , 5 , 6 and segmentation 7 , 8 , 9 , 10 , 11 , 12 of cellular structures in electron micrographs. Additionally, multiple DL methods have shown promising results in tasks such as single particle reconstruction and tomographic reconstruction. 13 , 14 , 15 , 16 , 17 Moreover, DL models have been applied to classify different types of nanoparticles 18 and to identify defects in materials at the atomic scale. 19 Additionally, Belevich et al. (2016) 20 integrated deep learning‐based automated image segmentation and analysis tools into the user‐friendly Microscopy Image Browser, streamlining the processing of large, multidimensional microscopy datasets.

In this context, a deeper understanding of DL methods by EM experts can significantly improve both the design and performance of DL‐based approaches, as well as the effective use of DL tools. Familiarity with the underlying principles of DL enables more accurate interpretation and evaluation of black box model predictions, helping researchers to better trust and troubleshoot their outputs. Moreover, when EM researchers consider factors such as class imbalance, structurally relevant features, and potential sources of bias during method development, they are better equipped to create well‐structured training datasets and procedures, leading to more robust and reliable models. However, implementing such customisations often requires coding expertise, which remains a barrier for those without programming experience. Previous efforts 21 have aimed to make DL more accessible to the EM community, and the present work can be seen as a natural extension of these initiatives.

We introduce DeepEM Playground, a user‐friendly, interactive platform (see Figure 1 or visit https://viscom-ulm.github.io/DeepEM/) designed to help EM researchers train, test, and develop DL models using state‐of‐the‐art methods, without requiring any coding experience. Built on a centralised web interface, the platform organises EM‐specific use cases into clearly defined task categories. A use case is a practical, task‐oriented example, contributed by DL experts, that allows users to train, adapt, and evaluate their own model. Each use case follows a standardised workflow to ease the learning curve, which is detailed both in this paper and on the website. The workflow integrates a data annotation interface, allowing users to annotate their own lab‐specific data to train and evaluate a model. We acknowledge that the audience of the DeepEM Playground workflow includes both EM experts, as users, and DL experts, as contributors, who may have different levels of familiarity with the domain. To address this, we provide targeted explanations and resources for each group on our website, ensuring that both can easily navigate, get started, and benefit from the DeepEM Playground. Importantly, DeepEM Playground runs on a cloud‐based infrastructure, eliminating the need for local high‐performance computing resources. Additionally, it provides a framework for DL experts to contribute their own methods, making advanced tools accessible to both the EM and DL communities.

FIGURE 1

The DeepEM Playground start page. A user‐friendly platform designed to bridge the gap between DL research and its practical application in EM labs. For more information, visit https://viscom-ulm.github.io/DeepEM/.

2. USE CASES

In the context of our DeepEM Playground, we group use cases into three distinct, EM‐specific tasks: ‘Image to Value(s)’, ‘Image to Image’, and ‘2D to 3D’ (see Figure 2). Here, we refer to a task as a broad category of problems or objectives that share similar input and output data types, while a use case is a specific instance or scenario within a task, where the same type of input and output data is applied to address a particular problem. Our DeepEM Playground comes with one example use case for each task. A use case is defined by its primary focus, that is, the broad area of application (e.g. segmentation). For each use case, an exemplary application is implemented based on provided example data (e.g. segmentation of cellular structures). Playground users are able to adapt the application area of the use case within its primary focus.

FIGURE 2

Use cases in the area of EM data analysis can be categorised by the type of input and output data of the DL method into three different tasks: Image to Value(s), Image to Image, and 2D to 3D. We discuss these tasks in detail in the following, providing exemplary notebooks of specific use cases for each group.

In the following, we discuss use cases of applying DL to EM for each task. We further provide a more in‐depth discussion of the implemented example use cases.

2.1. Image to Value(s)

Image to Value(s) tasks are defined by their image input and the output of a single value or multiple values. Common examples involve classification, regression, or detection. Classification in the context of EM refers to the process of categorising EM images or specific regions of them into predefined classes based on their visual characteristics. For example, it can be used to identify ‘good’ or ‘bad’ imaging regions of the sample of interest. 22 This is done by making the model predict a probability distribution, which models the probability that the input image belongs to each class in a predefined set (for example, C = {‘good’, ‘bad’}).

Regression tasks in EM refer to a type of predictive modelling used to predict a continuous output variable based on an input micrograph. Unlike classification, which assigns discrete classes to the input data, regression outputs a continuous value. This technique is particularly valuable for tasks that require quantifying certain properties of an EM image, such as the number of visible virus particles.

Lastly, detection refers to the process of identifying and locating specific objects or features by a bounding box within an image. Unlike simple classification, which assigns labels to entire images, or regression, which predicts continuous values, object detection combines both tasks: it pinpoints exact bounding boxes by regressing their position and size, and classifies the object located within each bounding box. It allows for deriving information about the position, count, and size of the detected objects. This process is essential for tasks where understanding spatial distributions and feature characteristics, such as of virus particles within a micrograph, is critical.
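To make these three output types concrete, the following minimal PyTorch sketch (illustrative only, not the DeepEM Playground implementation) contrasts classification, regression, and single‐box detection heads on a shared feature extractor; all layer sizes and names are placeholder assumptions.

```python
# A minimal sketch contrasting the three "Image to Value(s)" output types on
# top of a shared CNN feature extractor (toy architecture, not our pipeline).
import torch
import torch.nn as nn

backbone = nn.Sequential(                       # toy feature extractor for 1-channel EM crops
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),      # -> (batch, 32)
)

cls_head = nn.Linear(32, 2)    # classification: logits for C = {'good', 'bad'}
reg_head = nn.Linear(32, 1)    # regression: one continuous value, e.g. a particle count
det_head = nn.Linear(32, 6)    # detection (single box): 4 box coordinates + 2 class logits

x = torch.randn(8, 1, 224, 224)                 # batch of greyscale micrograph crops
feats = backbone(x)

probs = cls_head(feats).softmax(dim=1)          # probability distribution over classes
count = reg_head(feats).squeeze(1)              # unbounded continuous prediction
boxes, det_logits = det_head(feats).split([4, 2], dim=1)  # box regression + classification
print(probs.shape, count.shape, boxes.shape)
```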

2.1.1. Application in EM

DL has been successfully applied to various ‘Image to Value’ tasks in EM, demonstrating significant improvements in efficiency and accuracy. One notable area of application is particle picking in cryo‐EM, a crucial preprocessing step for single‐particle analysis. Traditionally, this process has been labour‐intensive and prone to human error. DeepPicker 23 and DeepCryoPicker 24 represent two significant advancements in this field. DeepPicker utilises a DL approach with a sliding window classifier to automate particle picking. What distinguishes DeepPicker is its innovative cross‐molecule training strategy, which learns common features from previously analysed micrographs. This strategy eliminates the need for human intervention during particle picking, thereby accelerating the analysis process and improving consistency. Similarly, DeepCryoPicker 24 enhances particle picking by leveraging automated unsupervised learning to generate training datasets and then training a DNN for particle classification. This hybrid approach combines supervised DL with automated unsupervised clustering, resulting in accurate particle picking capabilities in cryo‐EM studies. 24

Another critical task in EM is the identification of ‘good’ regions within cryo‐EM images, where the quality and thickness of the ice film directly impact the resolution of the structure of interest. Traditionally, researchers manually select regions with optimal ice thickness, which is time‐consuming and subjective. To address this challenge, a DL‐based method was developed to automatically identify these regions. 22 This system consists of a detector and a classifier trained to recognise ‘good’ regions from low‐magnification EM images. By reducing the reliance on manual selection, this approach streamlines the imaging process and ensures more consistent high‐quality data acquisition for structural determination.

In virus research, virus particle detection is crucial for understanding infection mechanisms and developing therapeutic strategies. DL has enabled significant advancements in this area as well. For instance, a weakly supervised approach has been developed to detect virus capsids in EM images using minimal annotation effort. 4 By leveraging binary annotations indicating the presence or absence of virus particles, this method efficiently identifies virus particles without the need for extensive manual annotations, thereby accelerating virus research and analysis.

Moreover, in the context of detecting the cytoplasmic envelopment stages of human cytomegalovirus (HCMV), a DL network was trained using augmented datasets that include synthetic, labelled images generated by a generative adversarial network (GAN). 5 This augmentation strategy addresses the challenge of limited annotated data and enhances the network's ability to accurately detect and classify virus maturation stages in EM images.

2.1.2. Example use case: explainable virus quantification

We develop a regression model (see Figure 3) to quantify HCMV capsids during secondary envelopment in TEM images. This process, essential for viral maturation, involves capsids acquiring a final envelope by budding into cytoplasmic vesicles. 25 TEM images capture different maturation stages – naked, budding, and enveloped capsids – enabling comparative analysis between wild‐type and mutant viruses with envelopment defects. 26 , 27 , 28 To preserve high‐resolution details, we avoid downscaling images before training. Instead, we extract 224×224 crops at random positions during training to match the model input. We leverage a ResNet50 29 initialised with CEM500k 30 model weights. During inference, images are resized to the nearest multiple of 224 in both dimensions and processed using a non‐overlapping sliding window approach. Finally, the trained model can count the number of naked, budding, and enveloped capsids in input images. To enhance trustworthiness and facilitate error detection, we apply Grad‐CAM, 31 a visualisation technique that highlights the most relevant regions influencing the model's predictions, providing insight into its decision‐making process. The model architecture and training presented here have not been previously published. However, we use publicly available data from Ref. (5), with location‐based labels. This dataset serves as a representative example for illustrating the use case within DeepEM Playground. This use case can be applied to any other counting task by exchanging the training data.
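The sliding‐window inference described above can be sketched as follows. This is a simplified, hedged outline rather than the published pipeline: the greyscale input adaptation and three‐output head are assumptions, CEM500k weight loading and Grad‐CAM are omitted, and the size rounding is illustrative.

```python
# Hedged sketch: resize to the nearest multiple of 224, run a ResNet-50
# regressor on each non-overlapping 224x224 tile, and sum per-class counts.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

model = resnet50(weights=None)
model.conv1 = torch.nn.Conv2d(1, 64, 7, stride=2, padding=3, bias=False)  # greyscale input
model.fc = torch.nn.Linear(model.fc.in_features, 3)  # naked / budding / enveloped counts
model.eval()

@torch.no_grad()
def count_capsids(image: torch.Tensor) -> torch.Tensor:
    """image: (1, H, W) greyscale micrograph; returns summed counts per class."""
    _, h, w = image.shape
    nh, nw = max(round(h / 224), 1) * 224, max(round(w / 224), 1) * 224
    resized = F.interpolate(image[None], size=(nh, nw), mode="bilinear")[0]
    tiles = resized.unfold(1, 224, 224).unfold(2, 224, 224)     # (1, nh/224, nw/224, 224, 224)
    tiles = tiles.reshape(1, -1, 224, 224).permute(1, 0, 2, 3)  # (n_tiles, 1, 224, 224)
    preds = model(tiles).clamp(min=0)                           # non-negative per-tile counts
    return preds.sum(dim=0)                                     # total counts per class

print(count_capsids(torch.rand(1, 1000, 1400)))
```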

FIGURE 3

Overview of the explainable virus quantification use case. A regression model is trained using data from Ref. (5) to predict the number of naked, budding, and enveloped virus capsids in an image. GradCAM is employed to visualise the areas the model focuses on to make its predictions, helping to detect errors and increase the model's trustworthiness.

2.2. Image to Image

Image to Image tasks are defined by an image input as well as an output also in the form of an image. Common examples involve transforming the input image into a new image representation. These tasks are fundamental in various applications within EM, where enhancing, restoring, or analysing images is crucial for extracting valuable information from EM data. In denoising, the noisy input image is translated into a noise‐free version. In super‐resolution, a low‐resolution micrograph is translated into a high‐resolution micrograph, thereby enhancing the detail and clarity of the observed structures. Lastly, in segmentation, the input micrograph is translated into a segmented image where different regions represent distinct components, such as certain cellular organelles or virus particles. This involves the classification of each pixel in the input micrograph.

Semantic segmentation refers to the process where each pixel in the image is classified into a predefined category (e.g. ‘nucleus’, ‘mitochondria’, or ‘virus particle'). In this case, pixels that belong to the same category are grouped together, forming regions or segments that represent parts of the sample, but without distinguishing between separate instances of the same category. For example, all mitochondria in the image would be labelled the same, regardless of how many individual mitochondria are present.

In contrast, instance segmentation not only classifies each pixel but also distinguishes between individual instances of the same category. In this case, even if multiple objects belong to the same class (e.g. multiple virus particles), each instance is given a unique label. This allows for a more detailed segmentation, where each distinct object in the image, even if it is part of the same category, is recognised as a separate entity.

Panoptic segmentation combines the strengths of both semantic and instance segmentation. It provides a unified framework where every pixel is assigned a class label (as in semantic segmentation), but it also distinguishes between different instances of the same class (as in instance segmentation). The key difference is that panoptic segmentation ensures every pixel in the image is labelled as either part of an object instance or a background region, making it a comprehensive approach for segmenting both things (distinct object instances) and stuff (amorphous regions like backgrounds or tissue regions).

In all three types of segmentation, segments formed by adjacent groups of uniformly classified pixels are typically labelled, providing a clear distinction between different parts of the sample. However, instance segmentation and panoptic segmentation go a step further by assigning unique identifiers to individual objects, with panoptic segmentation also handling the background and non‐object regions.

2.2.1. Application in EM

DL has significantly advanced Image to Image tasks in EM, tackling challenges such as denoising, super‐resolution, and segmentation with impressive results. Traditionally, denoising EM images required pairs of high‐ and low‐quality micrographs for training, which were acquired through manual curation or synthetic data generation. Breakthroughs in DL, however, have eliminated the need for clean‐noisy image pairs. Techniques like Noise2Noise, 32 Noise2Self, 33 and Noise2Void 34 have shown that denoising can be achieved using only noisy images, leveraging the network's ability to learn from inherent data distributions. Methods such as Deep Image Prior 35 take a different approach by utilising the network architecture itself to reconstruct clean images. Some works in standard Computer Vision even focus on making these methods applicable to domains with little training data and limited compute. 36 , 37 Some of these methods have been successfully adapted to the EM domain, such as Cryo‐CARE 38 and Topaz‐Denoise 39 for cryo‐EM images. These methods not only improve the signal‐to‐noise ratio (SNR) but also enhance the fidelity of structural details or remove artefacts. 40
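As a rough illustration of why clean targets are unnecessary, the toy sketch below implements the blind‐spot masking idea popularised by Noise2Void 34: random pixels are replaced with neighbouring values and the loss is computed only at those positions. The tiny network and random data are placeholders, not a production denoiser.

```python
# Toy sketch of blind-spot training: hide random pixels by copying in a
# neighbouring value, then train the network to predict the original noisy
# values at exactly those masked positions.
import torch
import torch.nn as nn

def mask_pixels(noisy: torch.Tensor, n: int = 64):
    """Replace n random pixels with random neighbours; return input and mask."""
    masked = noisy.clone()
    _, h, w = noisy.shape
    ys, xs = torch.randint(1, h - 1, (n,)), torch.randint(1, w - 1, (n,))
    dy, dx = torch.randint(-1, 2, (n,)), torch.randint(-1, 2, (n,))
    masked[0, ys, xs] = noisy[0, ys + dy, xs + dx]    # copy a neighbouring value in
    mask = torch.zeros_like(noisy)
    mask[0, ys, xs] = 1.0
    return masked, mask

net = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 1, 3, padding=1))   # stand-in denoiser
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

noisy = torch.rand(1, 128, 128)                       # a single noisy micrograph
for _ in range(100):
    inp, mask = mask_pixels(noisy)
    pred = net(inp[None])[0]
    loss = ((pred - noisy) ** 2 * mask).sum() / mask.sum()  # loss only at masked pixels
    opt.zero_grad(); loss.backward(); opt.step()
```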

In super‐resolution tasks, DL models play a pivotal role in enhancing the spatial resolution of EM images beyond the diffraction limit. Techniques developed by Suveer et al. (2019) 41 and Fang et al. (2021) 42 utilise neural networks to reconstruct high‐resolution images from lower‐resolution inputs, facilitating detailed biological analysis that was previously constrained by imaging limits. These methods often rely on synthetic data, downsampling strategies, or computationally degraded high‐resolution images to train models effectively. Some researchers also use physical experiments to generate real high‐ and low‐resolution image pairs. 42 In another line of work, researchers use GAN models for resolution enhancement, ultimately speeding up SEM image acquisition by reducing electron charging and sample damage. 43

Semantic segmentation, essential for identifying and delineating cellular structures in EM, benefits greatly from DL advancements. Architectures such as U‐Net 44 and its variants have proven instrumental in automating the segmentation of biological structures from EM data. Many works build on these architectures for the segmentation of different structures across varying EM contexts, 8 , 9 , 10 , 11 , 12 including 2D and 3D segmentation. These approaches not only improve segmentation accuracy but also enhance the efficiency of data analysis, empowering researchers to extract meaningful quantitative information or segmentation masks required for 3D visualisations from EM datasets with unprecedented speed and reliability. Studies like EM‐stellar 45 benchmark these existing segmentation methods, making it possible to identify their advantages and disadvantages. Finally, DeepETPicker 46 interprets particle picking from cryo‐ET data for solving 3D structures of biomacromolecules in situ as a 3D segmentation task, showing promising results for speeding up the data preprocessing required for subtomogram averaging.

2.2.2. Example use case: semantic segmentation of cellular structures

We address the semantic segmentation of cellular structures in EM images as an Image‐to‐Image task, using an encoder‐decoder network based on U‐Net 44 to generate semantic masks. Following Ref. (7), we develop an ensemble model combining multiple CNN backbones, weighting predictions based on validation performance of each model to improve robustness, particularly given the small dataset size (see Figure 4). We utilise real data from Ref. (47) and encourage researchers to apply our pipeline to their own annotated datasets. Pixel‐level masks distinguish cellular structures such as cytoplasm and background. The dataset is split into 60%–20%–20% for training, validation, and testing, with augmentations (flipping, rotation, transposition, and grid distortion) applied on‐the‐fly to enhance variance. As we are working with an ensemble model which relies on multiple different backbones, we are not able to leverage the EM pretrained weights of CEM500k 30 as they are only provided for ResNet‐50 backbones. Instead, we leverage pretrained models on natural images (ImageNet 47 ). Although there is a domain gap, pretrained weights from models trained on natural images can still be beneficial because they are trained on large datasets, which helps the model learn useful feature representations. This use case can be adapted to any other semantic segmentation task, by exchanging the training data.
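A hedged sketch of such a validation‐weighted ensemble is shown below, using the segmentation_models_pytorch package as a convenience (an assumption; the published pipeline 7 may differ). The encoder choices and validation scores are illustrative placeholders.

```python
# Hedged sketch of a validation-weighted U-Net ensemble: each U-Net gets a
# different encoder, and predictions are averaged with weights proportional
# to each model's validation score.
import torch
import segmentation_models_pytorch as smp

encoders = ["resnet34", "efficientnet-b0", "densenet121"]
models = [smp.Unet(encoder_name=e, encoder_weights="imagenet",  # ImageNet-pretrained encoders
                   in_channels=1, classes=2) for e in encoders]
for m in models:
    m.eval()

# Hypothetical per-model validation scores (e.g. mean IoU), used as weights.
val_scores = torch.tensor([0.78, 0.81, 0.75])
weights = val_scores / val_scores.sum()

@torch.no_grad()
def ensemble_predict(image: torch.Tensor) -> torch.Tensor:
    """image: (B, 1, H, W); returns weighted-average class probabilities."""
    probs = [m(image).softmax(dim=1) for m in models]
    return sum(w * p for w, p in zip(weights, probs))

pred = ensemble_predict(torch.rand(1, 1, 256, 256)).argmax(dim=1)  # (B, H, W) mask
```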

FIGURE 4

Overview of the semantic segmentation of cellular structures use case. An ensemble model combines multiple CNN backbones, as described in Ref. (7). The ensemble model's predictions are a weighted average of the predictions of each model, based on its corresponding validation performance. This has been shown to enhance robustness, 7 especially when working with small dataset sizes.

2.3. 2D to 3D

2D to 3D tasks are characterised by their process of converting multiple two‐dimensional (2D) images into a three‐dimensional (3D) representation. These tasks are essential in various fields, such as structural biology and material science, where understanding the 3D structure of samples from 2D projections is crucial. By integrating information from multiple 2D projections, these methods aim to produce an accurate and detailed 3D representation of the sample, enhancing our understanding of its spatial organisation and functional features. Common examples include electron tomography (ET), subtomogram averaging, and single particle reconstruction (SPR). In this context, subtomogram averaging is a special case: in essence, it can be interpreted as a 3D to 3D task; however, as its 3D input is based on a tomographic reconstruction of 2D tilt series, we count it as a 2D to 3D task. The input of ET and subtomogram averaging is therefore defined by one or multiple tilt series, while for SPR, the input corresponds to a set of 2D picked particles.

2.3.1. Application in EM

DL has significantly advanced 2D to 3D tasks in EM, enabling the reconstruction of 3D structures from 2D tilt series with remarkable accuracy. In tomographic reconstruction, innovative approaches like Clean implicit 3D structure from noisy 2D STEM images 13 have been developed. This method simultaneously reconstructs and denoises electron tomographic STEM data, effectively suppressing noise and mitigating the missing wedge effect, resulting in cleaner and more accurate 3D reconstructions.

For SPR, multiple works have been presented. Frameworks like cryoGAN 15 utilise adversarial learning to model 3D structures from numerous 2D projections. CryoGAN employs a deep adversarial learning scheme to match the distribution of real data with simulated projections, starting from a zero‐valued volume and leveraging a cryo‐EM physics simulator to achieve realistic reconstructions. Additionally, recent advancements such as the work by Shekarforoush et al. 17 have improved ab initio cryo‐EM reconstruction through semi‐amortised pose inference. This approach estimates the pose and refines it using a multi‐head architecture and SGD optimisation, demonstrating the potential of DL to enhance the accuracy and efficiency of 2D to 3D tasks in EM. A significant breakthrough in the field is the use of neural network architectures that encode structural heterogeneity and optimise both pose and structure jointly. One such approach is cryoDRGN, 14 which employs an image‐encoder and volume‐decoder structure to handle the variability in cryo‐EM data. CryoDRGN performs heterogeneous reconstruction by learning a deep generative model for 3D volumes and optimises pose and heterogeneity together. Building on this, cryoDRGN2 16 addresses computational bottlenecks and inaccuracies in the initial approach, significantly speeding up pose search and achieving state‐of‐the‐art accuracy for ab initio reconstruction of complex cryo‐EM datasets.

Similarly, tomoDRGN 48 leverages encoder‐decoder architectures to improve sub‐tomogram averaging in cryo‐ET. This method maps tilt images of each particle into a low‐dimensional latent space, facilitating the iterative refinement of alignment and averaging. By enhancing the SNR through the averaging of multiple tomograms, tomoDRGN enables clearer and more detailed 3D reconstructions from noisy cryo‐ET data.

2.3.2. Example use case: tomographic reconstruction

We implement a self‐supervised DL‐based tomographic reconstruction of a 2D STEM tilt series (see Figure 5), following Ref. (13), to enable 3D visualisation of cellular structures. This approach eliminates the need for annotated data, requiring only a single STEM tilt series for reconstruction. Unlike conventional DL methods, the implemented approach is designed to ‘overfit’ to a single reconstruction while generalising to unknown tilt angles, including those within the missing wedge. This method has shown promising results, 13 particularly in suppressing the missing wedge effect, outperforming traditional reconstruction techniques such as weighted backprojection (WBP) 50 and the simultaneous iterative reconstruction technique (SIRT). 51 However, this benefit comes at the cost of requiring model retraining for each individual reconstruction, as the tilt series itself serves as the training data.

FIGURE 5

Overview of the 2D‐to‐3D reconstruction process. A DL model predicts density values for 3D positions, while a physics‐based simulator 13 , 49 generates synthetic micrographs based on the predicted densities. The model is trained to align its predicted micrographs with real micrographs at known tilt angles, enabling it to learn the underlying 3D structure of the sample. Due to the generalisation ability of DNNs, this method has been shown to suppress the missing wedge effect. 13

The reconstruction pipeline integrates a DL model with a physics‐based STEM simulator 13 , 49 to generate synthetic micrographs. The simulator is able to compute an EM image based on the underlying 3D sample, which is learned by the DL model. The model is trained to ensure that the predicted micrographs align with real micrographs at known tilt angles, thereby learning the underlying 3D structure of the sample. To assess performance, we include synthetic datasets with ground truth phantoms 13 and evaluate inference on a real tilt series of HCMV‐infected fibroblasts. 28 Preprocessing involves structuring input data into a folder containing individual .tif files for each tilt angle and a .rawtlt file listing the corresponding angles. Additionally, we provide an option for image downscaling to enhance computational efficiency and apply min‐max normalisation to meet the requirements of the STEM image formation process.
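The sketch below condenses this idea into a toy example: a coordinate MLP predicts non‐negative density, a simple parallel projection stands in for the physics‐based STEM simulator, 13 , 49 and the loss compares predicted projections against the measured tilt series. All sizes, angles, and data here are placeholders.

```python
# Simplified, hedged sketch of self-supervised implicit reconstruction: an MLP
# maps 3D coordinates to density; projecting (summing) the density along the
# beam for each tilt angle yields a synthetic micrograph to compare against
# the measured one. The real pipeline uses a physics-based STEM simulator.
import math
import torch
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(3, 128), nn.ReLU(),
                    nn.Linear(128, 128), nn.ReLU(),
                    nn.Linear(128, 1), nn.Softplus())   # non-negative density

def project(theta: float, n: int = 32) -> torch.Tensor:
    """Integrate predicted density along the beam direction for tilt angle theta."""
    coords = torch.stack(torch.meshgrid(
        *[torch.linspace(-1, 1, n)] * 3, indexing="ij"), dim=-1)  # (n, n, n, 3)
    c, s = math.cos(theta), math.sin(theta)
    rot = torch.tensor([[c, 0., s], [0., 1., 0.], [-s, 0., c]])   # rotation about tilt axis
    density = mlp(coords.reshape(-1, 3) @ rot.T).reshape(n, n, n)
    return density.sum(dim=2)                                     # sum along beam -> (n, n)

opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)
tilt_angles = torch.deg2rad(torch.arange(-60., 61., 3.))          # typical +-60 degree series
measured = [torch.rand(32, 32) for _ in tilt_angles]              # placeholder tilt series

for theta, target in zip(tilt_angles, measured):
    loss = ((project(theta.item()) - target) ** 2).mean()         # match known tilt angles
    opt.zero_grad(); loss.backward(); opt.step()
```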

Our example use cases underscore the transformative impact of DL on EM. By automating complex tasks, improving accuracy, and reducing manual intervention, DL technologies are advancing our capabilities to explore and understand biological structures at unprecedented levels of detail and efficiency. As these methodologies continue to evolve, they hold promise for further accelerating scientific discovery and innovation in EM and related fields.

2.4. Training setup

Training DL models typically requires powerful workstations with GPU acceleration as well as the setup of a compatible software environment, installing relevant libraries and dependencies, and managing datasets – all of which can pose significant challenges for users of the DeepEM Playground. To address this, web‐based platforms such as Google Colab 52 and Lightning AI Studios 53 have emerged, offering intuitive interfaces that abstract away much of this complexity and allow users to run DL workflows in the cloud. To simplify the deployment and execution of DL models within DeepEM Playground, we leverage Lightning AI Studios, a cloud‐based platform designed to eliminate the need for manual environment configuration or access to specialised hardware. This enables EM researchers to train models efficiently and reproducibly, making DL more accessible to users without programming or systems expertise. While similar to Google Colab, Lightning AI Studios offers several advantages: it avoids the need for repeated environment setup (e.g. reinstalling packages for each session) and allows persistent storage of training data and model checkpoints. Additionally, it enables easy sharing of so‐called ‘Teamspaces’ – collaborative environments where multiple users can access and work with the same set of code, data, and model weights. This makes it easier for different labs or team members to contribute to a project, track changes, and run or test models in a shared, consistent setup without needing complex configuration or manual file transfers. By using Lightning AI Studios, DeepEM Playground ensures a scalable and user‐friendly solution for running computationally intensive tasks with minimal setup. A step‐by‐step ‘Getting Started’ guide is available on our website.

2.5. Community involvement

The provided use cases serve as illustrative examples, demonstrating the potential applications of DL in EM. However, they are not intended to represent a comprehensive collection. We therefore encourage the research community to create their own use cases by leveraging the proposed workflow, which simplifies and streamlines the process of training models. Researchers interested in contributing their work are invited to share their implemented code in combination with example data via Lightning AI Studios and to reach out to one of the authors for further collaboration. For more details, please see our project page at https://viscom-ulm.github.io/DeepEM/your-contribution.html.

3. WORKFLOW

The workflow supports the full DL pipeline in EM, covering development (data handling, model training, and evaluation) and inference (data/model selection and prediction). It provides a consistent, code‐free interface that helps researchers adapt DL models to new tasks while deepening their understanding of DL techniques. To encourage community contributions, a PyTorch‐based deepem library enables DL specialists to implement and extend the workflow more easily (Figure 6).

FIGURE 6

The DeepEM Playground workflow comprises two main stages: (1) Development and (2) Inference. In the Development stage, data for training, validation, and testing is collected, the model is trained, and its performance is evaluated. In the Inference stage, the trained model is applied to new, unseen data to finally assist in the analysis of EM data.

In the following, we provide an overview of the workflow and give insight into its key components.

3.1. Development

In DL for EM, the process of creating and optimising models to address specific challenges within EM is known as development. This process is structured around three key steps:

1. Data: Preparing high‐quality, well‐annotated datasets tailored to the EM lab's needs.

2. Model Training: Training models by optimising architectures and parameters.

3. Model Evaluation: Evaluating the model's performance using task‐specific metrics.

3.1.1. Data

Each contributed use case provides one set of example training data, which can be used to familiarise researchers with the topic and the use case. However, we encourage users to provide their own datasets, thereby learning about the input and output data as well as the annotations. Moreover, this approach allows users to train their own model that better matches the specific needs of their laboratory, ultimately enhancing the model's ability to support EM data analysis.

Data acquisition

The first step in creating a dataset is data acquisition, which involves gathering raw EM images from various imaging modalities such as TEM, SEM, or STEM. It is essential to collect a variety of images that are representative of the research task to ensure the model's ability to generalise well. However, there is no universal guideline for how to collect data, how diverse the dataset should be, or how much data is needed – these decisions depend heavily on the specific task and are skills developed through experience. When applicable, DL experts may provide guidance on the balance and diversity of the dataset within their contributed use case to ensure that it is suitable for the DL model. EM researchers, with their domain expertise, are responsible for assembling diverse and well‐balanced datasets that capture the relevant features of the task.

Data annotation

Data annotation is one of the most time‐consuming yet crucial steps in preparing datasets for DL. It involves labelling raw EM images with relevant information or categories to make them interpretable for machine learning models.

High‐quality annotations enable the model to more effectively recognise and interpret the relevant features in EM images. To support researchers in this process, DeepEM Playground integrates the Computer Vision Annotation Tool (CVAT), 54 a browser‐based, open‐source platform for creating labelled datasets with minimal setup. Using CVAT standardises the annotation format across use cases, facilitating data exchange and enabling EM researchers to tailor datasets to their specific needs. This setup allows users to adapt models simply by replacing the training data, making it easy to develop and refine lab‐specific models. A detailed ‘Getting Started’ guide is provided on our website to help users annotate their own data with CVAT. In addition, DL experts are encouraged to provide detailed guides for annotation within the context of their specific use case to ensure consistency, accuracy, and reproducibility.
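For illustration, the snippet below shows how a CVAT export in the standard COCO format could be read back for training; the file name and folder layout are assumptions about a typical export, not a prescribed Playground format.

```python
# Hedged sketch: read a COCO-style CVAT export and collect bounding boxes per image.
import json
from collections import defaultdict

with open("annotations/instances_default.json") as f:    # assumed export file name
    coco = json.load(f)

id_to_file = {img["id"]: img["file_name"] for img in coco["images"]}
id_to_label = {cat["id"]: cat["name"] for cat in coco["categories"]}

boxes = defaultdict(list)                                # file_name -> [(label, x, y, w, h)]
for ann in coco["annotations"]:
    x, y, w, h = ann["bbox"]                             # COCO boxes are [x, y, width, height]
    boxes[id_to_file[ann["image_id"]]].append(
        (id_to_label[ann["category_id"]], x, y, w, h))
```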

Data preprocessing

In line with a standardised data annotation format, data preprocessing is essential to prepare the dataset for model training, ensuring that the data is in the correct format and of sufficient quality for optimal performance. This preprocessing includes several steps, such as reformatting and image enhancement, to ensure the dataset aligns with the model's requirements.

  • Data reformatting: The data must be in a format compatible with the DL training pipeline. DL experts are encouraged to design their use case using standard file formats such as .tif or .mrc, which are widely supported in EM contexts, simplifying data preprocessing for EM experts.

  • Image enhancement: To improve image quality, techniques such as denoising, contrast adjustment, and image normalisation may be applied. These enhancements are particularly useful for improving model performance and will be documented by the DL expert within each use case if applicable.

DL experts may suggest additional preprocessing steps to enhance image quality or increase the robustness of the model. These steps should be clearly documented in the use case guidelines to help EM researchers properly prepare their data. At the same time, EM researchers often have deeper insights into commonly used tools and techniques for image enhancement, which they can incorporate and test within the workflow to further improve data quality and model performance.

Data structuring

A clear directory structure is essential for ensuring that EM researchers can easily swap out training, validation, and test data for different applications of the use case. During implementation of the use cases, DL experts are advised to implement the data splitting as part of the workflow, so that EM researchers only need to provide a single folder of images without organising the data themselves. Still, in case of any exceptions, clear documentation will be provided for each use case to ensure consistency in data structuring and minimise the risk of errors during dataset preparation.
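A minimal sketch of such an automatic split is shown below, assuming a single flat folder of .tif images and the 60%–20%–20% ratio used in our segmentation example; the folder path is a placeholder.

```python
# Hedged sketch: shuffle the files in one user-provided folder and divide them
# into training, validation, and test sets with a reproducible seed.
import random
from pathlib import Path

def split_dataset(folder: str, seed: int = 42):
    files = sorted(Path(folder).glob("*.tif"))           # one flat folder of images
    random.Random(seed).shuffle(files)                   # fixed seed -> reproducible split
    n = len(files)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return (files[:n_train],                             # 60% training
            files[n_train:n_train + n_val],              # 20% validation
            files[n_train + n_val:])                     # 20% test

train, val, test = split_dataset("data/images")
```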

3.1.2. Model training

Model training is a critical step in developing a DL model. It involves optimising the internal parameters (weights) of the model based on the relationships between input data and corresponding outputs defined in the training set. This process is guided by a loss function, which measures the error between the model's predictions and the true values (labels). The model is iteratively updated to minimise this error by adjusting its parameters, thereby improving its performance over time. The improvement of model performance is typically monitored by validating the model at regular validation intervals. The purpose of validation is to evaluate the model's ability to generalise to new, unseen data, as well as to detect issues such as overfitting. Therefore, validation is carried out on a separate dataset (usually a subset of the provided data), which the model has not seen during training. By monitoring the model's performance on the validation set, we can refine the model and ensure that it will deliver reliable results in real‐world applications.

Hyperparameter tuning

An important aspect of model training is hyperparameter tuning, which involves selecting the optimal values for parameters that influence the model's performance but cannot be optimised during the training process itself. To simplify the hyperparameter tuning process, we offer an automated search option. For users who prefer not to delve into the technical details, a default search space is provided, defined by contributing DL experts.

However, for those more experienced or willing to explore, we provide an option to modify the search space through a simple form. This flexibility allows users to fine‐tune the model's performance according to their specific needs. DL experts will provide explanations regarding the influence of each tunable parameter within the context of the specific use case, helping users make informed decisions during the tuning process.
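As an illustration, a simple random search over a default search space might look as follows; the parameter names and the train_and_validate routine are placeholders, not the Playground's actual implementation.

```python
# Hedged sketch of automated hyperparameter search: sample configurations from
# a (possibly expert-provided) search space and keep the best-scoring one.
import random

search_space = {                       # a default space a DL expert might define
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "batch_size": [4, 8, 16],
    "augmentation_strength": [0.0, 0.5, 1.0],
}

def train_and_validate(config: dict) -> float:
    """Placeholder: train with `config`, return a validation score (higher = better)."""
    return random.random()

best_score, best_config = float("-inf"), None
for _ in range(20):                    # 20 random trials
    config = {k: random.choice(v) for k, v in search_space.items()}
    score = train_and_validate(config)
    if score > best_score:
        best_score, best_config = score, config
print(best_config, best_score)
```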

Training and validation

During training, the model's performance is continuously monitored through an integrated logging system that requires no additional setup, making it easy for EM researchers to use. This system provides valuable insights for both EM and DL experts, helping to identify potential issues and offering a deeper understanding of the model training process. Additionally, the logs track improvements over time and provide a visual representation of the model's progress, enabling users to assess its development and performance effectively.

The logs are stored in a dedicated directory, following the naming convention logs/data‐current‐datetime/, and contain various pieces of information that facilitate the monitoring and evaluation of the model's training process. One key component is the record of hyperparameters, which details the hyperparameters used during the training process, allowing users to reproduce the results if needed. Another important element is the model checkpoints, which are snapshots of the model taken at various stages of training. These checkpoints enable users to resume training if needed or to use the best‐performing model for inference.

Additionally, training/validation loss curves are generated to visualise the model's learning progress over time. These graphs are crucial for identifying trends such as overfitting and understanding the model's improvement throughout the training process. To assess the model's visual accuracy, qualitative visualisations are included, which usually consist of sample images paired with model predictions. These visualisations help to qualitatively evaluate how well the model is performing on unseen data.

Together, this logging infrastructure enables users to comprehensively monitor the model's training, evaluate its performance at various stages, and make informed decisions for further improvements or adjustments, as well as learn about the training process of a DL model.
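The sketch below mimics this logging layout: one datetime‐named run directory holding hyperparameters, checkpoints, and loss‐curve values. The exact file names and placeholder losses are illustrative, not the Playground's actual output.

```python
# Hedged sketch of per-run logging: hyperparameters, per-epoch checkpoints,
# and loss-curve values stored under a datetime-named directory.
import json
import time
from pathlib import Path
import torch

run_dir = Path("logs") / time.strftime("%Y-%m-%d_%H-%M-%S")   # one directory per run
run_dir.mkdir(parents=True, exist_ok=True)

hparams = {"learning_rate": 1e-3, "batch_size": 8}            # example hyperparameters
(run_dir / "hparams.json").write_text(json.dumps(hparams))    # record for reproducibility

model = torch.nn.Linear(4, 1)                                 # stand-in for the real model
history = []
for epoch in range(3):
    train_loss, val_loss = 1.0 / (epoch + 1), 1.2 / (epoch + 1)  # placeholder losses
    history.append({"epoch": epoch, "train": train_loss, "val": val_loss})
    torch.save(model.state_dict(), run_dir / f"checkpoint_epoch{epoch}.pt")

(run_dir / "loss_curves.json").write_text(json.dumps(history))   # for later plotting
```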

3.1.3. Model evaluation

Model evaluation is a critical step in determining whether the trained model meets the desired performance criteria and is ready for application or requires further refinement. The evaluation process ensures that the model can effectively perform the intended task and provides insights into potential areas for improvement.

Choose model

The DeepEM Playground offers flexibility in evaluating not only the most recent model but also other models with the same checkpoint structure (i.e. trained within the same use case, possibly by other colleagues). This can be done easily by specifying the checkpoint path in a text field within the use case. This feature supports collaboration and sharing of different models, as they can be evaluated across different EM labs.

Evaluation

To evaluate the performance of the model, the DL expert chooses specific evaluation metrics based on the goals of the model. These metrics are designed to highlight the strengths and weaknesses of the model, helping both EM and DL experts understand its effectiveness. It is the responsibility of the DL expert to document the evaluation process thoroughly and provide a well‐defined set of metrics that are relevant to the model's objectives. These metrics must be clearly explained within the use case to ensure that EM experts can interpret the results accurately. Test metrics and visualisations are provided within the logging tool, as described above, including qualitative and quantitative evaluations on the test set.
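For example, two task‐specific metrics a DL expert might document are mean absolute error for the counting use case and intersection‐over‐union for the segmentation use case; the sketch below shows straightforward implementations.

```python
# Hedged sketch of two task-specific evaluation metrics.
import torch

def mean_absolute_error(pred: torch.Tensor, target: torch.Tensor) -> float:
    """Counting: average absolute difference between predicted and true counts."""
    return (pred - target).abs().mean().item()

def iou(pred_mask: torch.Tensor, target_mask: torch.Tensor) -> float:
    """Segmentation: overlap of binary masks divided by their union."""
    inter = (pred_mask & target_mask).sum().item()
    union = (pred_mask | target_mask).sum().item()
    return inter / union if union > 0 else 1.0

print(mean_absolute_error(torch.tensor([3., 5.]), torch.tensor([4., 5.])))  # 0.5
print(iou(torch.tensor([[1, 1], [0, 0]], dtype=torch.bool),
          torch.tensor([[1, 0], [0, 0]], dtype=torch.bool)))                # 0.5
```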

Although DeepEM Playground serves as a platform to bridge the gap between DL and EM experts, fostering interdisciplinary collaboration and teaching about the workings of DL, it is important to note that no trained model is flawless. We do not take responsibility for any irresponsible usage of the trained models or their predictions.

3.2. Inference

Inference is the process of applying a trained and tested DL model to make predictions on new, unseen data. This step is crucial for leveraging the model's capabilities to support the analysis of EM data.

Define data

To perform inference, the first step is to define the data on which the model will make predictions. This data can either be provided as a single file or as a folder containing multiple files, allowing for efficient processing of larger datasets.

Choose model

Selecting the appropriate model for inference is essential. The model should have undergone a thorough evaluation based on well‐defined performance metrics, as determined by the DL expert. It is important that the model selected for inference has demonstrated strong performance during evaluation, ensuring that the predictions made are reliable and accurate.

Make prediction

Once the model is selected, it can be used to make predictions on the provided data. Again, it is important to note that human oversight is essential to ensure the predictions are plausible and accurate. DL experts are responsible for implementing the inference pipeline, optimising it for performance, and ensuring that any necessary pre‐ and post‐processing steps are included to generate meaningful results for EM specialists. EM experts are responsible for carefully monitoring the predictions of the models to maintain the integrity and reliability of the analysis.
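A hedged sketch of such an inference loop over a single file or a folder is given below; build_model, the normalisation, and the output format are placeholders for use‐case‐specific choices, and the saved predictions still require the human oversight described above.

```python
# Hedged sketch of folder-based inference: accept a file or folder, load each
# image, run the chosen checkpoint, and store predictions next to the inputs.
from pathlib import Path
import numpy as np
import torch
from PIL import Image

def load_inputs(path: str):
    p = Path(path)
    return [p] if p.is_file() else sorted(p.glob("*.tif"))   # single file or folder

@torch.no_grad()
def run_inference(path: str, checkpoint: str, build_model):
    model = build_model()                                    # placeholder model factory
    model.load_state_dict(torch.load(checkpoint, map_location="cpu"))
    model.eval()
    for f in load_inputs(path):
        img = torch.from_numpy(np.array(Image.open(f), dtype=np.float32))
        pred = model(img[None, None] / 255.0)                # (1, 1, H, W) normalised input
        torch.save(pred, f.with_suffix(".pred.pt"))          # prediction next to the input
        print(f.name, "done - remember to visually check the prediction")
```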

4. CONCLUSION

The DeepEM Playground introduces new opportunities for collaboration between EM and DL researchers by providing a user‐friendly platform that enables EM experts to explore, test, and train models without requiring coding expertise. To enhance accessibility, we have grouped DL‐based use cases into three distinct tasks, providing an example use case for each to illustrate common data analysis principles in EM. The playground allows researchers to easily adapt use cases to their specific lab needs by simply swapping training data, thanks to standardised image annotation formats based on the CVAT annotation tool. This enables EM researchers to not only use pre‐built use cases from DL experts but also to train models on their own annotated, lab‐specific datasets. The standardised workflow further streamlines the integration of state‐of‐the‐art DL methods into EM research. Additionally, the PyTorch‐based deepem library facilitates contributions from DL specialists, adhering to our proposed workflow.

By leveraging Lightning AI Studios, we eliminate the complexities and costs of GPU environments and configurations, making powerful DL tools more accessible to EM researchers and fostering collaboration across disciplines. DeepEM Playground empowers EM specialists to harness DL techniques for more accurate, reproducible, and efficient analysis of EM data, bridging the gap between cutting‐edge DL research and its practical application in EM labs.

We invite the research community to contribute to and build upon this playground, democratising access to DL methodologies and promoting an interdisciplinary approach to scientific discovery. Through the playground, we not only offer a platform for EM specialists to learn about DL without coding, but also help integrate the latest DL developments into real‐world EM applications.

CONFLICT OF INTEREST STATEMENT

The authors declare no conflicts of interest.

ACKNOWLEDGEMENTS

The authors would like to thank Jens von Einem (Institute of Virology, Ulm University Medical Center) for providing HCMV‐infected cells, and Clarissa Read (Central Facility for Electron Microscopy, Ulm University) for her help and valuable discussions on the topic, which improved the playground. Moreover, we would like to thank Julia La Roche for her support in testing the playground from the perspective of an EM expert.

Open access funding enabled and organized by Projekt DEAL.

Kniesel, H. , Poonam, P. , Payer, T. , Bergner, T. , Hermosilla, P. , & Ropinski, T. (2025). DeepEM Playground: Bringing deep learning to electron microscopy labs. Journal of Microscopy, 299, 287–300. 10.1111/jmi.70005

REFERENCES

1. Gu, Y., Shen, M., Yang, J., & Yang, G.‐Z. (2018). Reliable label‐efficient learning for biomedical image recognition. IEEE Transactions on Biomedical Engineering, 66(9), 2423–2432.
2. Yakimovich, A., Beaugnon, A., Huang, Y., & Ozkirimli, E. (2021). Labels in a haystack: Approaches beyond supervised learning in biomedical applications. Patterns, 2(12).
3. Bruno, G., Lynch, M. J., Jacobs, R., Morgan, D. D., & Field, K. G. (2023). Evaluation of human‐bias in labeling of ambiguous features in electron microscopy machine learning models. Microscopy and Microanalysis, 29, 1493–1494.
4. Kniesel, H., Sick, L., Payer, T., Bergner, T., Devan, K. S., Read, C., Walther, P., Ropinski, T., & Hermosilla, P. (2023). Weakly supervised virus capsid detection with image‐level annotations in electron microscopy images. In The Twelfth International Conference on Learning Representations, Wien, Austria.
5. Shaga Devan, K., Walther, P., von Einem, J., Ropinski, T., Kestler, H. A., & Read, C. (2021). Improved automatic detection of herpesvirus secondary envelopment stages in electron microscopy by augmenting training data with synthetic labelled images generated by a generative adversarial network. Cellular Microbiology, 23(2), e13280.
6. Devan, K. S., Walther, P., von Einem, J., Ropinski, T., Kestler, H. A., & Read, C. (2019). Detection of herpesvirus capsids in transmission electron microscopy images using transfer learning. Histochemistry and Cell Biology, 151, 101–114.
7. Shaga Devan, K., Kestler, H. A., Read, C., & Walther, P. (2022). Weighted average ensemble‐based semantic segmentation in biological electron microscopy images. Histochemistry and Cell Biology, 158(5), 447–462.
8. Khadangi, A., Boudier, T., & Rajagopal, V. (2021a). EM‐net: Deep learning for electron microscopy image segmentation. In 2020 25th International Conference on Pattern Recognition (ICPR) (pp. 31–38). IEEE.
9. Urakubo, H., Bullmann, T., Kubota, Y., Oba, S., & Ishii, S. (2019). UNI‐EM: An environment for deep neural network‐based automated segmentation of neuronal electron microscopic images. Scientific Reports, 9(1), 19413.
10. Guay, M. D., Emam, Z. A., Anderson, A. B., Aronova, M. A., Pokrovskaya, I. D., Storrie, B., & Leapman, R. D. (2021). Dense cellular segmentation for EM using 2D–3D neural network ensembles. Scientific Reports, 11(1), 2561.
11. Gallusser, B., Maltese, G., Di Caprio, G., Vadakkan, T. J., Sanyal, A., Somerville, E., Sahasrabudhe, M., O'Connor, J., Weigert, M., & Kirchhausen, T. (2022). Deep neural network automated segmentation of cellular structures in volume electron microscopy. Journal of Cell Biology, 222(2), e202208005.
12. Hirabayashi, Y., Iga, H., Ogawa, H., Tokuta, S., Shimada, Y., & Yamamoto, A. (2024). Deep learning for three‐dimensional segmentation of electron microscopy images of complex ceramic materials. npj Computational Materials, 10(1), 46.
13. Kniesel, H., Ropinski, T., Bergner, T., Devan, K. S., Read, C., Walther, P., Ritschel, T., & Hermosilla, P. (2022). Clean implicit 3D structure from noisy 2D STEM images. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 20762–20772). New Orleans, LA, USA.
14. Zhong, E. D., Bepler, T., Berger, B., & Davis, J. H. (2021). CryoDRGN: Reconstruction of heterogeneous cryo‐EM structures using neural networks. Nature Methods, 18(2), 176–185.
15. Gupta, H., McCann, M. T., Donati, L., & Unser, M. (2021). CryoGAN: A new reconstruction paradigm for single‐particle cryo‐EM via deep adversarial learning. IEEE Transactions on Computational Imaging, 7, 759–774.
16. Zhong, E. D., Lerer, A., Davis, J. H., & Berger, B. (2021). CryoDRGN2: Ab initio neural reconstruction of 3D protein structures from real cryo‐EM images. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (pp. 4066–4075).
17. Shekarforoush, S., Lindell, D. B., Brubaker, M. A., & Fleet, D. J. (2024). Improving ab‐initio cryo‐EM reconstruction with semi‐amortized pose inference. arXiv preprint arXiv:2406.10455.
18. Zelenka, C., Kamp, M., Strohm, K., Kadoura, A., Johny, J., Koch, R., & Kienle, L. (2023). Automated classification of nanoparticles with various ultrastructures and sizes via deep learning. Ultramicroscopy, 246, 113685.
19. Maksov, A., Dyck, O., Wang, K., Xiao, K., Geohegan, D. B., Sumpter, B. G., Vasudevan, R. K., Jesse, S., Kalinin, S. V., & Ziatdinov, M. (2019). Deep learning analysis of defect and phase evolution during electron beam‐induced transformations in WS2. npj Computational Materials, 5(1), 12.
20. Belevich, I., Joensuu, M., Kumar, D., Vihinen, H., & Jokitalo, E. (2016). Microscopy Image Browser: A platform for segmentation and analysis of multidimensional datasets. PLoS Biology, 14(1), e1002340.
21. von Chamier, L., Laine, R. F., Jukkala, J., Spahn, C., Krentzel, D., Nehme, E., Lerche, M., Hernández‐Pérez, S., Mattila, P. K., Karinou, E., Holden, S., Solak, A. C., Krull, A., Buchholz, T. O., Jones, M. L., Royer, L. A., Leterrier, C., Shechtman, Y., Jug, F., … Henriques, R. (2021). Democratising deep learning for microscopy with ZeroCostDL4Mic. Nature Communications, 12(1), 2276.
22. Yokoyama, Y., Terada, T., Shimizu, K., Nishikawa, K., Kozai, D., Shimada, A., Mizoguchi, A., Fujiyoshi, Y., & Tani, K. (2020). Development of a deep learning‐based method to identify "good" regions of a cryo‐electron microscopy grid. Biophysical Reviews, 12, 349–354.
23. Wang, F., Gong, H., Liu, G., Li, M., Yan, C., Xia, T., Li, X., & Zeng, J. (2016). DeepPicker: A deep learning approach for fully automated particle picking in cryo‐EM. Journal of Structural Biology, 195(3), 325–336.
24. Al‐Azzawi, A., Ouadou, A., Max, H., Duan, Y., Tanner, J. J., & Cheng, J. (2020). DeepCryoPicker: Fully automated deep neural network for single protein particle picking in cryo‐EM. BMC Bioinformatics, 21, 1–38.
25. Mettenleiter, T. C. (2004). Budding events in herpesvirus morphogenesis. Virus Research, 106(2), 167–180.
26. Read, C., Schauflinger, M., Nikolaenko, D., Walther, P., & von Einem, J. (2019). Regulation of human cytomegalovirus secondary envelopment by a C‐terminal tetralysine motif in pUL71. Journal of Virology, 93(13), 10–1128.
27. Cappadona, I., Villinger, C., Schutzius, G., Mertens, T., & von Einem, J. (2015). Human cytomegalovirus pUL47 modulates tegumentation and capsid accumulation at the viral assembly complex. Journal of Virology, 89(14), 7314–7328.
28. Read, C., Walther, P., & von Einem, J. (2021). Quantitative electron microscopy to study HCMV morphogenesis. Methods in Molecular Biology (Clifton, N.J.), 2244, 265–289.
29. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–778). IEEE.
30. Conrad, R., & Narayan, K. (2021). CEM500K, a large‐scale heterogeneous unlabeled cellular electron microscopy image dataset for deep learning. eLife, 10, e65894.
31. Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad‐CAM: Visual explanations from deep networks via gradient‐based localization. In Proceedings of the IEEE International Conference on Computer Vision (pp. 618–626). IEEE.
32. Lehtinen, J., Munkberg, J., Hasselgren, J., Laine, S., Karras, T., Aittala, M., & Aila, T. (2018). Noise2Noise: Learning image restoration without clean data. arXiv preprint arXiv:1803.04189.
33. Batson, J., & Royer, L. (2019). Noise2Self: Blind denoising by self‐supervision. In International Conference on Machine Learning (pp. 524–533). PMLR.
34. Krull, A., Buchholz, T.‐O., & Jug, F. (2019). Noise2Void: Learning denoising from single noisy images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2129–2137). IEEE.
35. Ulyanov, D., Vedaldi, A., & Lempitsky, V. (2018). Deep image prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 9446–9454). IEEE.
36. Mansour, Y., & Heckel, R. (2023). Zero‐shot Noise2Noise: Efficient image denoising without any data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 14018–14027). IEEE.
37. Calvarons, A. F. (2021). Improved Noise2Noise denoising with limited data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (pp. 796–805). IEEE.
38. Buchholz, T.‐O., Jordan, M., Pigino, G., & Jug, F. (2019). Cryo‐CARE: Content‐aware image restoration for cryo‐transmission electron microscopy data. In 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019) (pp. 502–506). IEEE.
39. Bepler, T., Kelley, K., Noble, A. J., & Berger, B. (2020). Topaz‐Denoise: General deep denoising models for cryoEM and cryoET. Nature Communications, 11(1), 5208.
40. Lobato, I., Friedrich, T., & Van Aert, S. (2024). Deep convolutional neural networks to restore single‐shot electron microscopy images. npj Computational Materials, 10(1), 10.
41. Suveer, A., Gupta, A., Kylberg, G., & Sintorn, I.‐M. (2019). Super‐resolution reconstruction of transmission electron microscopy images using deep learning. In 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019) (pp. 548–551). IEEE.
42. Fang, L., Monroe, F., Novak, S. W., Kirk, L., Schiavon, C. R., Yu, S. B., Zhang, T., Wu, M., Kastner, K., Latif, A. A., Lin, Z., Shaw, A., Kubota, Y., Mendenhall, J., Zhang, Z., Pekkurnaz, G., Harris, K., Howard, J., & Manor, U. (2021). Deep learning‐based point‐scanning super‐resolution imaging. Nature Methods, 18(4), 406–416.
43. de Haan, K., Ballard, Z. S., Rivenson, Y., Wu, Y., & Ozcan, A. (2019). Resolution enhancement in scanning electron microscopy using deep learning. Scientific Reports, 9(1), 12050.
44. Ronneberger, O., Fischer, P., & Brox, T. (2015). U‐Net: Convolutional networks for biomedical image segmentation. In Navab, N., Hornegger, J., Wells, W., & Frangi, A. (Eds.), Medical Image Computing and Computer‐Assisted Intervention – MICCAI 2015: 18th International Conference, Munich, Germany, October 5–9, 2015, Proceedings, Part III (pp. 234–241). Springer.
45. Khadangi, A., Boudier, T., & Rajagopal, V. (2021b). EM‐stellar: Benchmarking deep learning for electron microscopy image segmentation. Bioinformatics, 37(1), 97–106.
46. Liu, G., Niu, T., Qiu, M., Zhu, Y., Sun, F., & Yang, G. (2024). DeepETPicker: Fast and accurate 3D particle picking for cryo‐electron tomography using weakly supervised deep learning. Nature Communications, 15(1), 2090.
47. Deng, J., Dong, W., Socher, R., Li, L.‐J., Li, K., & Fei‐Fei, L. (2009). ImageNet: A large‐scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248–255). IEEE.
48. Powell, B. M., & Davis, J. H. (2024). Learning structural heterogeneity from cryo‐electron sub‐tomograms with tomoDRGN. Nature Methods, 21, 1525–1536.
49. Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., & Ng, R. (2021). NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1), 99–106.
50. Radermacher, M. (2006). Weighted back‐projection methods. In Frank, J. (Ed.), Electron tomography: Methods for three‐dimensional visualization of structures in the cell (pp. 245–273). Springer.
51. Trampert, J., & Leveque, J.‐J. (1990). Simultaneous iterative reconstruction technique: Physical interpretation based on the generalized least squares solution. Journal of Geophysical Research: Solid Earth, 95(B8), 12553–12559.
52. Google Research. (2023). Google Colaboratory. Retrieved from https://colab.research.google.com
53. Lightning AI. (2023). Lightning AI Studios. Retrieved from https://lightning.ai/docs/studio
54. Sekachev, B., Manovich, N., Zhiltsov, M., Zhavoronkov, A., Kalinin, D., Hoff, B., TOsmanov, Kruchinin, D., Zankevich, A., DmitriySidnev, Markelov, M., Johannes222, Chenuet, M., a‐andre, telenachos, Melnikov, A., Kim, J., Ilouz, L., … Truong, T. (2020). opencv/cvat: v1.1.0. Zenodo. https://doi.org/10.5281/zenodo.4009388
