Author manuscript; available in PMC: 2026 Feb 26.
Published in final edited form as: J Struct Biol. 2025 May 14;217(2):108207. doi: 10.1016/j.jsb.2025.108207

AITom: AI-guided cryo-electron tomography image analyses toolkit

Xueying Zhan 1,1, Xiangrui Zeng 1,1, Mostofa Rafid Uddin 1, Min Xu 1,*
PMCID: PMC12934263  NIHMSID: NIHMS2140879  PMID: 40378936

Abstract

Cryo-electron tomography (cryo-ET) is an essential tool in structural biology, uniquely capable of visualizing three-dimensional macromolecular complexes within their native cellular environments, thereby providing profound molecular-level insights. Despite its significant promise, cryo-ET faces persistent challenges in the systematic localization, identification, segmentation, and structural recovery of three-dimensional subcellular components, necessitating the development of efficient and accurate large-scale image analysis methods. In response to these complexities, this paper introduces AITom, an open-source artificial intelligence platform tailored for cryo-ET researchers. AITom integrates a comprehensive suite of public and proprietary algorithms, supporting both traditional template-based and template-free approaches, alongside state-of-the-art deep learning methodologies for cryo-ET data analysis. By incorporating diverse computational strategies, AITom enables researchers to more effectively tackle the complexities inherent in cryo-ET, facilitating precise analysis and interpretation of complex biological structures. Furthermore, AITom provides extensive tutorials for each analysis module, offering valuable guidance to users in utilizing its comprehensive functionalities.

Keywords: Cryo-electron tomography, Computer vision, Machine learning, Image segmentation, Image classification

1. Introduction

Cryo-electron tomography (cryo-ET) is a powerful technique in structural biology that enables the visualization of three-dimensional macromolecular complexes within their native cellular environments at nanometer resolution (Oikonomou and Jensen, 2017). The technique acquires 2D projection images of a cell sample at a series of tilt angles, which are then reconstructed into a 3D image known as a tomogram. Cryo-ET preserves the native structures and spatial organization of macromolecular complexes and cellular ultrastructures, such as membranes, within the cytoplasm. Despite its promise, significant challenges remain in transforming raw tomogram data into detailed structural information about these macromolecular complexes.

One major challenge in analyzing cryo-electron tomography data is overcoming image distortions caused by noise and the missing-wedge effect (Turk and Baumeister, 2020). The primary source of noise is the inherently low signal-to-noise ratio (SNR) in tomograms, exacerbated by the thickness of the cellular sample and the dense, heterogeneous nature of the cellular environment. This low SNR significantly hampers the accurate structural recovery of macromolecular complexes, as critical details are often obscured by the surrounding noise (Bepler et al., 2020). A cell sample in cryo-ET is always imaged within a limited tilt angle range, typically around ±60° to minimize damage from the electron beam. Consequently, due to incomplete tilting, Fourier back-projection based tomogram reconstruction contains a missing wedge in its Fourier space. The missing-wedge effect results in missing data in the reconstructed tomograms, making it challenging to accurately interpret certain regions of the sample structure (Hagen et al., 2017).

Addressing these issues requires sophisticated computational analysis methods, which focus on denoising tomograms, detecting subtomograms that contain macromolecules, and conducting subtomogram classification and segmentation. These computational techniques are not just tools for image correction; they are pivotal in transforming raw, noisy data into clear, interpretable structures, thereby enabling researchers to delve deeper into the understanding of complex biological mechanisms (Han et al., 2009). The development and refinement of these methods continue to be a crucial area of focus in the field of structural biology, as they directly impact our ability to accurately model and understand the intricate workings of cellular components at the molecular level.

Another significant challenge in cryo-electron tomography is the considerable number of cellular structures that remain unidentified by the scientific community. This gap in knowledge renders many reference-guided structural recovery techniques and pattern mining methods less effective, especially when these unknown structures are present. Traditional methods, reliant on existing structural data, struggle to adapt to these uncharted entities. Consequently, there is a growing necessity for template-free methods, specifically designed for cryo-ET analysis, to bridge this gap. Such methods are pivotal for autonomously discovering and analyzing a broader spectrum of features within cellular structures, enabling template-free averaging and classification of subtomograms (Xu et al., 2019). These advancements are crucial in understanding cellular architecture, particularly in novel or poorly understood macromolecular complexes.

Driven by the need for sophisticated computational methods tailored to the multifaceted tasks of cryo-ET data analysis, our collaborative effort has led to the development of an open-source artificial intelligence platform, AITom. This initiative stems from the integration of the open-source software TomoMiner (Frazier et al., 2017) and the de novo Multi-Pattern Pursuit (MPP) framework (Xu et al., 2019) developed at Prof. Frank Alber’s lab (Alber, 2008). TomoMiner focused on (1) high-performance parallel computing for classifying large sets of subtomograms (e.g., over 100,000); (2) scalability and a modular architecture using cloud-based approaches; and (3) reference-free and reference-based classification, subtomogram alignment, and template matching. AITom not only inherits the high-performance parallelism of TomoMiner but also integrates AI-driven algorithms, including deep learning models for both supervised and unsupervised learning (e.g., autoencoders, adversarial domain adaptation). Central to the functionality of AITom is its ability to process a reconstructed 3D tomogram and generate a detailed structural map, highlighting both detected and accurately recovered structures. To facilitate ease of use, AITom is equipped with comprehensive tutorials that guide users through each module. For flexibility, each module can operate independently given the appropriate input format. Additionally, AITom combines scripting capabilities with a basic graphical user interface, allowing users to conduct interactive analysis remotely and thereby broadening the accessibility of cryo-ET data analysis.

AITom distinguishes itself among cryo-ET platforms by integrating AI-driven approaches and excelling in subtomogram classification and segmentation, alongside its comprehensive pipeline for large-scale data analysis. For instance, while existing software like EMAN2 (Tang et al., 2007) and Relion (Burt et al., 2024) offer both template-based and template-free particle picking, AITom complements these functionalities with advanced segmentation and subtomogram classification methods. Using deep learning models and techniques like autoencoders, open-set learning, and domain adaptation, AITom enables robust classification and clustering of macromolecular structures, even in heterogeneous datasets. AITom also incorporates unsupervised deep learning-based subtomogram alignment methods that surpass traditional tools like EMAN2 (Tang et al., 2007), Relion (Burt et al., 2024), and Dynamo (Castaño-Díez et al., 2012) by providing more accurate alignments and better handling of noise and structural heterogeneity.

Since 2017, deep learning methods have been proposed for cryo-ET data analysis and have garnered considerable attention (Gubins et al., 2019). These methods offer several advantages over traditional techniques, including significantly faster prediction and progressively improved model performance with the accumulation of big data. AITom incorporates innovative methods, leveraging recent advancements in deep learning and artificial intelligence for enhanced cryo-ET data analysis. This includes sophisticated approaches for particle picking, subtomogram classification and segmentation, and alignment and averaging, integrating state-of-the-art research findings directly into its toolset. These advancements allow AITom to tackle a broader spectrum of analytical tasks, offering unprecedented flexibility and power in the processing and interpreting of cryo-ET data.

The development of AITom significantly enriches cryo-ET data analysis. With its improved precision and flexibility, researchers can undertake more detailed and comprehensive structural analyses. This level of detail is crucial for understanding the complex interactions within macromolecular structures. Moreover, AITom’s user-friendly interface and modular design make it adaptable to various research needs, facilitating a deeper exploration of biological data. As the platform continues to evolve, incorporating the latest scientific advancements, it stands to make substantial contributions to our understanding of molecular structures and their functions. AITom’s ongoing development and expansion signify its potential as a pivotal tool for future breakthroughs in molecular biology research.

2. Results

2.1. Software implementation

AITom is a comprehensive platform encompassing an extensive range of programs, from tomogram-level preprocessing and particle picking to advanced subtomogram-level geometrical and deep learning methods. The foundation of these programs is primarily built on Python and C++, chosen for their robustness and adaptability in handling complex computational tasks. In the domain of deep learning-based methods, AITom leverages the combined strengths of Keras with a TensorFlow backend and PyTorch. This dual-framework setup offers versatility in neural network implementation, capitalizing on Keras’ user-friendly interface and TensorFlow’s scalability, alongside PyTorch’s dynamic computation graph and efficient memory usage. Such flexibility ensures that model development and training are efficient and adaptable to various research needs in cryo-ET.

Furthermore, AITom significantly extends its computational prowess by employing GPU-based computing in conjunction with traditional CPU parallel processing. This integration of GPU acceleration is crucial for surmounting the computationally intensive challenges typical in cryo-ET data analysis, such as the training of intricate deep learning models and the handling of extensive datasets. GPUs substantially increase processing speed and computational power, mitigating bottlenecks in data processing and analysis. As a result, AITom not only achieves enhanced computational performance but also facilitates more detailed and sophisticated analyses. Researchers using AITom can expect a marked improvement in efficiency and a reduction in processing times, allowing them to delve deeper into the complexities of molecular structures and interactions with greater precision and less computational overhead.

2.2. AITom analyzing programs

In the subsequent sections, we will detail the comprehensive functionalities of AITom, aligning them with the sequential steps involved in cryo-ET data processing. AITom is designed to offer a complete data processing workflow that includes a variety of critical tasks:

  • Data Preparation: Before diving into the core processing stages, AITom prioritizes the preparation of data, which is integral to the success of subsequent analyses. This crucial step involves not only the processing of real tomograms but also the simulation of tomograms and subtomograms. AITom is compatible with reconstructed 3D tomograms, typically generated using software like IMOD (Mastronarde and Held, 2017) or Warp (Tegunov and Cramer, 2019), ensuring seamless integration into standard cryo-ET preprocessing workflows. Simulated data are especially vital during the training of deep learning-based approaches, as they provide a controlled environment to develop, test, and refine algorithms, ensuring robustness and efficacy when applied to real tomographic data.

  • Data Pre-processing: The next phase in AITom’s workflow is data pre-processing, which involves volume loading and displaying, estimating the signal-to-noise ratio (SNR) and missing wedge, and tomogram denoising.

  • Particle Picking: This step involves identifying and extracting particles from the tomograms, a critical task for detailed structural analysis.

  • Tomogram and Subtomogram Segmentation: These processes involve dividing the tomogram into meaningful segments or regions as mask representations, facilitating detailed structural analysis.

  • Subtomogram Clustering and Classification: This function groups similar subtomograms together and categorizes them, which is crucial for understanding the diversity of structures in the sample.

  • Subtomogram Alignment and Averaging: These processes align multiple subtomograms for a coherent structural representation and average them to enhance the resolution and clarity of the data.

Each stage represents a critical component of the cryo-ET data analysis pipeline, and AITom is equipped to handle them efficiently and precisely. The workflow is summarized in Fig. 1. The following sections will provide a deeper understanding of these functionalities, illustrating how AITom facilitates a streamlined and effective analysis of cryo-ET data.

Fig. 1.

An example workflow of the AITom platform, showcasing its capabilities on a tomogram derived from primary rat neuron culture, as detailed in the study by Guo et al. (2018). Displayed from left to right are discernible patterns including the mitochondrial membrane, ellipsoid of strong signals, borders of ice crystals, TRiC-like structures, and ribosome-like patterns, alongside their respective spatial distributions within the sample. This figure highlights a subset of methods implemented in AITom.

2.2.1. Usage scenarios and data characteristics in Cryo-ET analysis

Cryo-ET tomograms exhibit varying noise levels, missing wedge artifacts, and structural complexity, each requiring tailored computational approaches. High-noise data necessitate aggressive denoising (e.g., anisotropic diffusion), while low-noise images benefit from contrast enhancement. Missing wedge distortions demand specific correction methods, and membrane-bound structures call for specialized segmentation and classification algorithms. Dense cellular environments complicate particle picking due to structural overlap, which is best addressed via machine learning, while sparse distributions favor template-based detection. For subtomogram-level analysis, subtomogram size further influences strategy: large volumes (>256³ px) capture complex assemblies but require scalable computation, whereas small volumes (<64³ px) risk information loss and need optimized classification.

2.3. Data preparation – Simulation

In the field of cryo-ET, simulation plays a critical role, particularly since annotations for cryo-ET data are often limited and can affect the training of deep learning-based methods, which typically require large datasets. AITom addresses this challenge with two types of simulations: whole tomogram simulation and subtomogram simulation, each serving distinct purposes.

2.3.1. Whole tomogram simulation

AITom includes a robust framework for simulating whole tomograms, meticulously designed to mirror the crowded environments of cellular cytoplasm (Pei et al., 2016). This simulation process involves creating three-dimensional environments that replicate the complexity of actual cellular conditions, utilizing a variety of macromolecular complexes from the Protein Data Bank (Berman et al., 2000). The generated simulated tomograms, complete with realistic aspects such as noise, image distortions, and electron optical factors, are instrumental in testing and refining particle picking methods. These simulations provide a valuable setting that closely resembles real cryo-ET conditions, enabling the development and optimization of more precise and efficient particle picking algorithms.

2.3.2. Subtomogram simulation

Simulating subtomograms of crowded cellular environments is particularly advantageous for tasks such as subtomogram classification, helping in the development and refinement of algorithms for accurate identification and categorization of macromolecular structures within the complex cellular cytoplasm. AITom employs both traditional packing-based and deep learning Generative Adversarial Network (GAN)-based approaches for this purpose.

2.3.2.1. Packing-based approaches.

The packing-based method for generating subtomograms, as detailed in Liu et al. (2020a,b), starts by representing each macromolecular complex with a bounding sphere. These spheres, with assigned random copy numbers indicating their abundance, are randomly positioned within a volume. Molecular dynamics simulations and simulated annealing are then used to efficiently pack these spheres, avoiding overlaps and ensuring realistic density. The complexes are placed inside these spheres in random orientations to create a composite density map that simulates a crowded cellular environment. This map is then used to simulate the tomographic imaging process at varying SNR levels, producing realistic subtomograms for analysis.
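The bounding-sphere placement step can be illustrated with a minimal sketch. It is not the procedure of Liu et al. (2020a,b): instead of molecular dynamics and simulated annealing, it substitutes simple random sequential placement with rejection of overlapping positions, and the function name `pack_spheres`, the box size, and the radii are hypothetical values chosen only for illustration.

```python
import numpy as np

def pack_spheres(radii, box_size, rng, max_tries=10_000):
    """Place non-overlapping spheres in a cubic box by rejection sampling.

    Simplified stand-in for the MD / simulated-annealing packing described
    in the text; all names and parameters here are illustrative assumptions.
    """
    centers, placed = [], []
    # Placing the largest spheres first makes rejection sampling succeed more often.
    for r in sorted(radii, reverse=True):
        for _ in range(max_tries):
            c = rng.uniform(r, box_size - r, size=3)  # keep the sphere inside the box
            if all(np.linalg.norm(c - c0) >= r + r0 for c0, r0 in zip(centers, placed)):
                centers.append(c)
                placed.append(r)
                break
        else:
            raise RuntimeError("could not place sphere; the box is too crowded")
    return np.array(centers), np.array(placed)

rng = np.random.default_rng(0)
radii = rng.uniform(4.0, 8.0, size=20)  # one bounding sphere per complex copy
centers, placed = pack_spheres(radii, box_size=100.0, rng=rng)
```

Each center would then receive a randomly rotated density map of the corresponding complex before the imaging simulation is applied.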

2.3.2.2. GAN-based approaches.

CryoETGAN (Wu et al., 2022) is a GAN-based model designed for subtomogram generation. It learns the mapping functions between experimental subtomograms, which are densely packed 3D grayscale images, and density maps simulated from proteins. Comprising two generators and two discriminators, CryoETGAN is adept at capturing data distributions from both domains. The discriminators are trained to differentiate between experimental and generated samples, enabling CryoETGAN to learn mappings between unpaired data in these domains effectively. This approach is invaluable for generating realistic cryo-ET images, improving the capability of AITom in simulating detailed and accurate subtomograms. Fig. 2 displays subtomogram simulation results of applying CryoETGAN, and each section of the image represents a slice of the generated subtomogram data.

Fig. 2.

2D slice visualization of subtomograms generated using CryoETGAN, including: (1) Top: Proteasome, Ribosome, TRiC, and Membrane; (2) Middle and Bottom: EMPIAR-10130&10131 (Rabbit muscle aldolase), EMPIAR-10133 (Glutamate dehydrogenase), EMPIAR-10135 (DNAB helicase), EMPIAR-10143 (T20S Proteasome), EMPIAR-10169 (Apoferritin), EMPIAR-10172 (Hemagglutinin), and EMPIAR-10173 (Insulin-bound insulin receptor).

Source: Reproduced from Wu et al. (2022) under the Creative Commons Attribution License (CC BY).

The packing-based approach offers a physically accurate simulation of crowded cellular environments, ideal for spatial analysis and particle-picking assessments. In contrast, the GAN-based approach, exemplified by CryoETGAN, excels in image synthesis, providing a versatile tool for generating realistic subtomograms from limited or unpaired data. However, it may require careful training and validation to ensure biological accuracy.

2.4. Data pre-processing

2.4.1. Volume loading and displaying

In AITom, the primary requirement for input tomogram data is that they should be in the MRC file format (Crowther et al., 1996). For effective visualization of these tomograms, AITom utilizes the capabilities of the IMOD software (Kremer et al., 1996), a widely recognized tool in the field of electron microscopy for image processing and analysis. Furthermore, AITom integrates the Python package Matplotlib, a powerful library for creating static, interactive, and animated visualizations. This combination provides a robust platform for users to load, display, and interact with tomogram data, facilitating a detailed examination of the structures within.

2.4.2. SNR and missing wedge estimation

AITom addresses two primary sources of image distortion in cryo-ET data: noise and the missing wedge effect. Accurate estimation of these distortions is crucial for enhancing the quality of downstream data analysis and optimizing parameter selection. To estimate the SNR, as defined in Frank (2006), AITom employs a method based on the cross-correlation between pairs of aligned subtomograms that contain identical structures. This approach allows for a precise assessment of SNR in the data. For the missing wedge estimation, AITom uses the cross-correlation between a predefined missing wedge mask and the tomogram data that has been projected into Fourier space. This method effectively quantifies the missing wedge region, a common issue due to the limited tilt range in tomographic data acquisition, enabling more accurate reconstructions and analyses of the tomogram.
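As a rough illustration of both estimates, the sketch below assumes the correlation-based SNR relation SNR = cc / (1 - cc) for two noisy copies of the same signal, and a wedge geometry with the tilt axis along y and the beam along z in a (z, y, x) array. The function names, the axis convention, and the candidate tilt angles are assumptions made for this example and are not taken from the AITom code.

```python
import numpy as np

def pearson_cc(a, b):
    """Pearson cross-correlation coefficient between two equally shaped arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def snr_from_cc(cc):
    """SNR estimate from the correlation of two aligned noisy copies of one structure."""
    return cc / (1.0 - cc)

def wedge_mask(shape, tilt_max_deg=60.0):
    """Boolean mask of sampled Fourier voxels (assumed tilt axis y, beam along z)."""
    kz = np.fft.fftfreq(shape[0])[:, None, None]
    kx = np.fft.fftfreq(shape[2])[None, None, :]
    sampled = np.abs(kz) <= np.abs(kx) * np.tan(np.deg2rad(tilt_max_deg))
    return np.broadcast_to(sampled, shape)

rng = np.random.default_rng(0)

# SNR: two aligned subtomograms sharing a signal, with independent noise.
signal = rng.normal(size=(32, 32, 32))          # unit-variance toy "structure"
a = signal + rng.normal(scale=2.0, size=signal.shape)
b = signal + rng.normal(scale=2.0, size=signal.shape)
snr_est = snr_from_cc(pearson_cc(a, b))         # true value is 1 / 2**2 = 0.25

# Missing wedge: correlate candidate masks with the Fourier amplitude.
vol = rng.normal(size=(32, 32, 32))
amp = np.abs(np.fft.fftn(vol) * wedge_mask(vol.shape, 60.0))  # simulate a +/-60 deg wedge
candidates = [40.0, 50.0, 60.0, 70.0]
scores = [pearson_cc(amp, wedge_mask(vol.shape, t).astype(float)) for t in candidates]
best_tilt = candidates[int(np.argmax(scores))]
```

The mask whose sampled region matches the non-zero Fourier amplitudes most closely yields the highest correlation, which is what makes the mask correlation usable as a tilt-range estimate.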

2.4.3. Denoising (a.k.a. Volume filtering)

Denoising is crucial in cryo-ET as it enhances the visibility of biological structures by reducing the high levels of noise inherent in cryo-ET images, thus improving the accuracy and reliability of subsequent structural analysis and interpretation (Bepler et al., 2020).

2.4.3.1. Gaussian denoising.

This method employs a Gaussian filter to smooth the data, effectively reducing noise. It operates by averaging pixel values with a Gaussian-weighted kernel, where the weights decrease with distance from the central pixel. This process blurs the image slightly, but significantly reduces random noise.
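A minimal sketch of this filter on a synthetic volume, using `scipy.ndimage.gaussian_filter`. The toy cube, the noise level, and `sigma=2.0` are illustrative assumptions; the sketch does not reproduce AITom's own interface.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

# Toy volume: a bright cube (the "structure") buried in Gaussian noise.
clean = np.zeros((32, 32, 32))
clean[12:20, 12:20, 12:20] = 1.0
noisy = clean + rng.normal(scale=1.0, size=clean.shape)

# sigma is the standard deviation of the Gaussian kernel, in voxels.
smoothed = gaussian_filter(noisy, sigma=2.0)

# Mean squared error against the known ground truth, before and after filtering.
err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((smoothed - clean) ** 2)
```

A larger `sigma` removes more noise but also blurs the cube edges more strongly, which is the trade-off the paragraph above describes.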

2.4.3.2. Bandpass filtering.

Bandpass filtering in cryo-ET selectively emphasizes specific frequency components in the data while attenuating others. Applied in Fourier space via the Fast Fourier Transform (FFT), it enhances structural details by preserving mid-range spatial frequencies (e.g., around 1/(10–50 Å)) while suppressing low-frequency background and high-frequency noise.
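The idea can be sketched with NumPy's FFT routines. The hard-edged mask, the function name `bandpass`, the 5 Å voxel size, and the 50 Å and 10 Å cutoffs are assumptions for illustration; a practical filter would usually taper the band edges to limit ringing.

```python
import numpy as np

def bandpass(vol, voxel_size, low_res, high_res):
    """Keep spatial frequencies between 1/low_res and 1/high_res.

    voxel_size, low_res and high_res share one length unit (e.g. Angstrom).
    """
    freqs = [np.fft.fftfreq(n, d=voxel_size) for n in vol.shape]
    kz, ky, kx = np.meshgrid(*freqs, indexing="ij")
    k = np.sqrt(kx**2 + ky**2 + kz**2)            # radial spatial frequency
    mask = (k >= 1.0 / low_res) & (k <= 1.0 / high_res)
    return np.real(np.fft.ifftn(np.fft.fftn(vol) * mask))

# Toy volume: constant background plus a wave with a 20 Angstrom period along x.
n, voxel = 32, 5.0
x = np.arange(n)
wave = np.cos(2 * np.pi * x / 4.0)[None, None, :] * np.ones((n, n, n))
vol = 3.0 + wave

# The constant background (zero frequency) is removed, the wave is kept.
filtered = bandpass(vol, voxel_size=voxel, low_res=50.0, high_res=10.0)
```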

2.4.3.3. Anisotropic diffusion.

Anisotropic diffusion (Fernández and Li, 2003) is a more advanced technique that aims to reduce noise while preserving edges and important structural details in the image. It iteratively diffuses the image in a way that encourages smoothing within regions of similar intensity while inhibiting smoothing across edges. This method is particularly effective for images with a significant amount of noise, as it can selectively smooth areas of the image without blurring important structural boundaries.
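The edge-preserving behaviour can be illustrated with the classic Perona-Malik scheme, which is a simpler relative of the method of Fernández and Li (2003) and is used here only as a stand-in. The function name, the periodic boundary handling, and the parameters `kappa` and `step` are assumptions for this sketch.

```python
import numpy as np

def perona_malik(vol, n_iter=20, kappa=0.5, step=0.1):
    """3D Perona-Malik diffusion with an exponential edge-stopping function.

    kappa sets the intensity difference treated as an edge; step must stay
    below 1/6 for stability with six neighbours. Boundaries are periodic.
    """
    u = vol.astype(float).copy()
    for _ in range(n_iter):
        update = np.zeros_like(u)
        for axis in range(u.ndim):
            for shift in (1, -1):
                diff = np.roll(u, shift, axis=axis) - u   # difference to one neighbour
                g = np.exp(-(diff / kappa) ** 2)          # near 0 across strong edges
                update += g * diff
        u += step * update
    return u

rng = np.random.default_rng(0)
clean = np.zeros((24, 24, 24))
clean[:, :, 12:] = 4.0                                    # a sharp step edge
noisy = clean + rng.normal(scale=0.2, size=clean.shape)
smoothed = perona_malik(noisy)

flat_std_before = noisy[:, :, 2:10].std()
flat_std_after = smoothed[:, :, 2:10].std()
edge_contrast = smoothed[:, :, 14:22].mean() - smoothed[:, :, 2:10].mean()
```

Noise inside each flat region is smoothed because neighbouring differences are small relative to `kappa`, while the step of height 4 is left almost untouched because the conductance across it is close to zero.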

2.4.4. Selection of denoising methods

The choice of a denoising strategy in AITom depends on both the noise level of the dataset and the degree of structural detail required. Anisotropic diffusion is particularly effective for highly noisy tomograms, as it reduces noise while preserving structural edges. This makes it well-suited for whole tomograms, larger subtomograms, and dense cellular regions. In contrast, Gaussian filtering offers a faster, more straightforward approach suitable for lower-noise tomograms or smaller subtomograms, striking a balance between smoothing and detail retention. Bandpass filtering enhances contrast by eliminating undesired frequency components, making it especially advantageous for membrane proteins or sparsely populated environments. Table 1 summarizes the recommended AITom simulation, denoising, and particle picking methods for various data types.

Table 1.

Recommended AITom simulation, denoising, and particle picking methods for cryo-ET datasets.

Data scenario                      | Simulation    | Denoising             | Particle picking
Whole Tomogram - High Noise        | Packing-based | Anisotropic Diffusion | DoG, Saliency Detection
Whole Tomogram - Low Noise         | Packing-based | Gaussian Filtering    | DoG, Saliency Detection
Whole Tomogram - Membrane Proteins | N/A           | Bandpass Filtering    | Saliency Detection, Faster R-CNN
Subtomogram - Large (>256³ px)     | Packing-based | Anisotropic Diffusion | N/A
Subtomogram - Small (<64³ px)      | CryoETGAN     | Gaussian Filtering    | N/A
Dense Cellular Environment         | CryoETGAN     | Anisotropic Diffusion | Faster R-CNN
Sparse Cellular Environment        | CryoETGAN     | Bandpass Filtering    | Saliency Detection

2.5. Particle picking

Particle picking is a crucial process in cryo-ET where small subvolumes, known as subtomograms, are extracted from a 3D tomogram to analyze individual structures. Automated algorithmic approaches are preferred given the labor-intensive nature of manual particle picking and the potential for bias based on individual biologists’ prior knowledge. Particle picking in AITom is categorized into two primary types: template-based and template-free.

2.5.1. Template matching (Template-based)

Template matching, a widely used method in template-based particle picking, calculates the cross-correlation between a structural template and a subvolume within the tomogram (Böhm et al., 2000). Since particles can be randomly oriented and located, template matching first generates various orientations of the structural template by rotation. The cross-correlation between these rotated templates and all locations in the tomogram is computed through convolution, with the highest value indicating the presence of the structure. AITom implements this method using a fast Fourier transform for efficient convolution (Nussbaumer, 2012). A threshold for cross-correlation is manually set to confirm the existence of structures at specific locations.
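The FFT-based correlation step can be sketched as follows for a single template orientation. The sketch uses an unnormalized, circular cross-correlation and omits the loop over rotated templates and the thresholding. The function name and the toy data are assumptions and do not reproduce AITom's implementation.

```python
import numpy as np

def cross_correlate(tomogram, template):
    """Circular cross-correlation of a zero-mean template with a tomogram via FFT.

    The value at index p is the correlation with the template's corner at p.
    """
    padded = np.zeros_like(tomogram, dtype=float)
    tz, ty, tx = template.shape
    padded[:tz, :ty, :tx] = template - template.mean()   # zero-pad to tomogram size
    f_tomo = np.fft.fftn(tomogram)
    f_tmpl = np.fft.fftn(padded)
    # Correlation theorem: multiply by the complex conjugate of the template spectrum.
    return np.real(np.fft.ifftn(f_tomo * np.conj(f_tmpl)))

rng = np.random.default_rng(0)
template = rng.normal(size=(6, 6, 6))                    # toy "structure"
tomogram = rng.normal(scale=0.1, size=(32, 32, 32))      # background noise
tomogram[5:11, 9:15, 14:20] += template                  # one copy hidden at (5, 9, 14)

cc = cross_correlate(tomogram, template)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(cc), cc.shape))
```

For randomly oriented particles, the same correlation would be repeated for each rotated copy of the template and the voxel-wise maximum kept before applying the manual threshold.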

2.5.2. Difference of Gaussian (DoG) (Template-free)

The Difference of Gaussians (DoG) method, an approximation of the Laplacian of Gaussian (LoG), is adapted in AITom for 3D cryo-ET analysis (Pei et al., 2016). DoG identifies potential particles by detecting peaks in the difference between two Gaussian-filtered images with distinct standard deviations. To ensure accuracy, local density peaks are filtered based on a noise threshold and distance criterion to prevent multiple detections of a single particle. AITom provides adjustable parameters such as smoothing factors and minimum distance between detected locations, with IMOD software used to visualize detected particle locations and aid in parameter tuning (Kremer et al., 1996).
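A minimal sketch of DoG peak picking with `scipy.ndimage`. The function name `dog_pick`, the sigma ratio, the threshold, and the assumption that particles are brighter than the background are illustrative choices and not AITom's defaults.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def dog_pick(vol, sigma1, sigma2, threshold, min_distance):
    """Return voxel coordinates of DoG local maxima above a threshold."""
    dog = gaussian_filter(vol, sigma1) - gaussian_filter(vol, sigma2)
    # A voxel is a peak if it equals the maximum of its neighbourhood, whose
    # half-width (min_distance) suppresses repeated picks of one particle.
    local_max = maximum_filter(dog, size=2 * min_distance + 1)
    is_peak = (dog == local_max) & (dog > threshold)
    return np.argwhere(is_peak), dog

# Toy tomogram: two Gaussian blobs standing in for particles.
vol = np.zeros((40, 40, 40))
vol[10, 10, 10] = 100.0
vol[24, 20, 16] = 100.0
vol = gaussian_filter(vol, sigma=2.0)

picks, dog = dog_pick(vol, sigma1=2.0, sigma2=3.2, threshold=0.05, min_distance=4)
```

On real data the threshold would be tied to the noise level of the DoG map, and the picked coordinates inspected visually while tuning the two sigmas.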

2.5.3. Object detection (Template-free)

To detect the objects of interest in cryo-ET images, we have integrated a method based on the Faster Region-based Convolutional Neural Network (Faster R-CNN) (Li et al., 2019) in AITom. This approach is adept at localizing and identifying mitochondria within the complex landscape of cellular cryo-ET images. The utility of Faster R-CNN lies in its ability to scan cryo-ET images efficiently, pinpointing the presence of mitochondria and encapsulating them within bounding boxes. Each detection is accompanied by a classification score quantifying the probability of the presence of a mitochondrion within these boxes. This method can be adapted to other organelles using corresponding annotated training datasets.

2.5.4. Saliency detection (Template-free)

Saliency detection, a machine learning approach, assesses the prominence of image subregions against their background (Zhu et al., 2014; Qin et al., 2015). In AITom, saliency detection is adapted for template-free particle picking in cryo-ET (Zhou et al., 2018). Using Robust-PCA (Candès et al., 2011), it separates background (low-rank matrix) from structural details (saliency matrix), enhancing feature extraction for accurate structure identification.

2.5.5. Post-picking particle refinement (Template-free)

The particle-picking process in cryo-electron tomography (cryo-ET) may inadvertently include false positives, necessitating a subsequent refinement step to enhance the quality of the particle images. This refinement is crucial for improving the homogeneity and resolution of the particles, which in turn facilitates a more accurate structural analysis. AITom employs simulated annealing (SA) (Shi et al., 2020) to refine particle selection from a heterogeneous set to establish a more homogeneous and high-quality subset by iteratively optimizing a resolution-based cost function. This process enhances data quality by filtering out low-confidence particles, reducing noise, and improving structural consistency. By optimizing particle selection, SA ensures higher-resolution reconstructions, contributing to more reliable structural analyses in cryo-ET.
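The selection loop can be sketched as simulated annealing over fixed-size subsets, where each move swaps one selected particle for one unselected particle. The cost used here, the negative norm of the subset average, is a toy stand-in for the resolution-based cost of Shi et al. (2020), and the function name, cooling schedule, and synthetic data are all assumptions.

```python
import numpy as np

def anneal_subset(cost, n_items, k, rng, n_steps=3000, t_start=1.0, t_end=1e-3):
    """Search for a size-k subset with low cost by swap moves and geometric cooling."""
    mask = np.zeros(n_items, dtype=bool)
    mask[rng.choice(n_items, size=k, replace=False)] = True
    current = cost(mask)
    best_mask, best = mask.copy(), current
    for step in range(n_steps):
        t = t_start * (t_end / t_start) ** (step / (n_steps - 1))
        i = rng.choice(np.flatnonzero(mask))       # particle to drop
        j = rng.choice(np.flatnonzero(~mask))      # particle to add
        mask[i], mask[j] = False, True
        new = cost(mask)
        # Always accept improvements; accept worse subsets with Boltzmann probability.
        if new <= current or rng.random() < np.exp((current - new) / t):
            current = new
            if new < best:
                best, best_mask = new, mask.copy()
        else:
            mask[i], mask[j] = True, False         # undo the swap
    return best_mask, best

rng = np.random.default_rng(0)
signal = np.ones(50)
good = signal + rng.normal(scale=0.5, size=(30, 50))   # true particles
bad = rng.normal(scale=1.0, size=(30, 50))             # false positives
data = np.vstack([good, bad])

def cost(mask):
    # Toy proxy: a coherent subset yields an average with a large norm.
    return -np.linalg.norm(data[mask].mean(axis=0))

best_mask, best_cost = anneal_subset(cost, n_items=60, k=30, rng=rng)
n_good_selected = int(best_mask[:30].sum())
```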

2.5.6. Selection of different particle picking methods

Template-based particle picking is well-suited for datasets where a reference structure is available, offering high accuracy in high-contrast tomograms but performing poorly with flexible or unknown targets. In contrast, template-free methods accommodate diverse datasets without prior structural information. DoG is effective for detecting well-separated particles in high-contrast tomograms, while saliency detection is more appropriate for low-contrast regions and membrane-associated proteins. Faster R-CNN is particularly useful in dense cellular environments, where distinguishing individual particles is challenging, and simulated annealing refines initial selections by improving homogeneity and reducing false positives.

2.6. Subtomogram classification

The particle picking stage in cryo-electron tomography (cryo-ET) yields numerous subtomograms, which are crucial for downstream tasks such as segmentation, classification, alignment, and averaging. These steps are integral in unraveling intricate molecular and subcellular structures.

2.6.1. Fully supervised learning based models

AITom incorporates multiple sophisticated deep learning models to classify cryo-ET subtomogram data (Xu et al., 2017; Che et al., 2018; Zeng et al., 2023), each designed to tackle the complexities of 3D structural data:

  • DSRF3D-v2 (Deep Small Receptive Field version 2): This model is a 3D adaptation of VGGNet (Simonyan and Zisserman, 2014), utilizing a Convolutional Neural Network (CNN) architecture. It features deeply stacked layers and compact 3D convolution filters of size 3 × 3 × 3. A substantial dropout rate of 70% improves the model’s ability to generalize to new data.

  • RB3D (Residual Block 3D): Based on the residual block concept (He et al., 2016), this model comprises four residual bottleneck blocks connected in sequence. Each block fuses two pathways at its termination, one for dimensionality reduction and restoration, and the other serving as a streamlined ‘shortcut’, employing a 50% dropout rate for generalization.

  • CB3D (3D convolutional (C3D)-based model): Inspired by the C3D framework (Tran et al., 2014), originally developed for large-scale supervised video datasets, CB3D processes 3D subtomograms as dynamic 2D entities, integrating max pooling layers within the convolutional structure and a 50% dropout to improve feature extraction and generalization.

  • AttPNet (Attention-based Point Network): Unlike conventional CNN-based models that process 3D volumetric images, AttPNet (Yang et al., 2020) transforms the data into 3D point sets, which preserves spatial characteristics while reducing computational complexity. It employs an attention mechanism with two branches: one generates attention masks to highlight informative regions, while the other extracts global features via convolution layers and a channel attention block. This structure enables AttPNet to capture fine structural variations and improve classification accuracy, particularly in noisy cryo-ET datasets.

  • YOPO (You Only Pool Once): YOPO is a distinctive CNN architecture (Zeng et al., 2023), specifically tailored for processing complex 3D datasets. YOPO adeptly handles the intricate task of extracting meaningful features from data with low SNR. Unlike traditional CNNs that progressively pool spatial information, YOPO performs pooling only once, preserving high-resolution structural details while ensuring transformation invariance. This design enables efficient feature extraction with minimal loss of spatial information, making it particularly effective for noisy, low-contrast cryo-ET data.

2.6.2. Models with limited supervision

The reliance on extensive labeled data in supervised learning models presents a significant challenge in subtomogram classification due to the laborious nature of manual labeling. AITom addresses this challenge by incorporating advanced machine learning strategies, including few-shot learning, domain adaptation, domain randomization, active learning, and open-set learning, to efficiently utilize limited labeled data.

2.6.2.1. Few-shot learning-based methods.

Few-shot learning is a machine learning technique designed to enable effective learning and generalization from minimal data, often just a few examples per class (Wang et al., 2020). In the context of subtomogram classification, where acquiring a substantial quantity of labeled samples for novel macromolecular structures is difficult, AITom integrates the ProtoNet-CE (Prototypical Networks - Combined Embedding) method (Li et al., 2020), which leverages both task-agnostic and task-specific embeddings, facilitating the classification of previously unseen structures with a minimal set of labeled samples. It enhances adaptability to new structural configurations, improving classification accuracy and efficiency in cryo-ET data.
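The core idea of prototypical networks can be sketched in a few lines. The example below is a simplified illustration that assumes subtomogram embeddings have already been computed by some encoder; it does not reproduce the combined task-agnostic and task-specific embedding of ProtoNet-CE, and the function names are hypothetical.

```python
import numpy as np

def class_prototypes(support_embeddings, support_labels):
    """Prototype of each class = mean embedding of its few labeled examples."""
    classes = np.unique(support_labels)
    protos = np.stack([support_embeddings[support_labels == c].mean(axis=0)
                       for c in classes])
    return classes, protos

def classify_by_prototype(query_embeddings, classes, protos):
    """Assign each query to the class with the nearest prototype (squared Euclidean)."""
    d2 = ((query_embeddings[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[d2.argmin(axis=1)]

# Toy 2D embeddings: three labeled examples per class (a "3-shot" setting).
support = np.array([[0.0, 0.1], [0.2, -0.1], [0.1, 0.0],
                    [5.0, 5.1], [4.9, 5.0], [5.1, 4.9]])
labels = np.array([0, 0, 0, 1, 1, 1])

classes, protos = class_prototypes(support, labels)
queries = np.array([[0.3, 0.2], [4.5, 5.2]])
predicted = classify_by_prototype(queries, classes, protos)
print(predicted)
```

Only the prototypes need to be recomputed when a new structural class is introduced, which is why a handful of labeled examples per class can suffice.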

2.6.2.2. Domain adaptation.

AITom employs domain adaptation to improve model generalization when training on simulated cryo-ET data and applying it to experimental datasets. Due to differences in noise patterns and structural details (domain shift), models trained solely on synthetic data often perform poorly on real tomograms. To address this, AITom integrates adversarial domain adaptation (Lin et al., 2019), which aligns feature distributions between synthetic and experimental datasets. The process involves three stages: (1) training the classification network with simulated data; (2) using a Domain Discriminator to blur the distinction between features derived from simulated training data and those from testing data; and (3) deploying the refined features from the extractor for predictions on the testing data.

2.6.2.3. Domain randomization.

AITom applies domain randomization to improve generalization by artificially diversifying training data through randomized simulation parameters. Unlike domain adaptation, which aligns features between datasets, domain randomization (Che et al., 2019) exposes models to a broad range of variations—such as SNR, missing wedge angles, defocus, and spherical aberration—during training. This training strategy helps the models to better generalize to real cryo-ET data, effectively bridging the gap between simulation and reality, and improving the accuracy and robustness of macromolecule structure classification and segmentation in cryo-ET datasets.
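A minimal sketch of the randomization step is given below. The parameter ranges are hypothetical placeholders chosen for illustration, not values used by AITom, and only the SNR is actually applied to the toy volume; a full simulator would also apply the missing wedge and the contrast transfer function.

```python
import numpy as np

# Hypothetical ranges for illustration only (not AITom defaults).
PARAM_RANGES = {
    "snr": (0.03, 1.0),
    "missing_wedge_deg": (30.0, 60.0),
    "defocus_um": (-8.0, -2.0),
    "spherical_aberration_mm": (2.0, 2.7),
}

def sample_simulation_params(rng):
    """Draw one random set of simulation parameters, uniformly within each range."""
    return {name: float(rng.uniform(lo, hi))
            for name, (lo, hi) in PARAM_RANGES.items()}

def add_noise_at_snr(volume, snr, rng):
    """Add zero-mean Gaussian noise so that var(signal) / var(noise) equals snr."""
    noise_std = np.sqrt(volume.var() / snr)
    return volume + rng.normal(scale=noise_std, size=volume.shape)

rng = np.random.default_rng(42)
clean = np.zeros((24, 24, 24))
clean[8:16, 8:16, 8:16] = 1.0                    # toy "macromolecule"

training_set = []
for _ in range(5):
    params = sample_simulation_params(rng)
    training_set.append((params, add_noise_at_snr(clean, params["snr"], rng)))
```

Each training sample is thus generated under a different imaging condition, so the classifier cannot rely on any single simulated setting.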

2.6.2.4. Active learning.

Active learning is an approach in which the algorithm strategically selects the most informative data points for manual labeling, optimizing the training process with a limited number of labeled samples (Zhan et al., 2021, 2022). AITom integrates active learning to optimize subtomogram classification with minimal labeled data. It employs the Hybrid Active Learning (HAL) method (Du et al., 2021) that selects the most informative data points for manual annotation intelligently. This method synergistically combines uncertainty sampling, which focuses on predictions with the highest uncertainty, and discriminative sampling, where a discriminator is used to maintain a balanced distribution between labeled and unlabeled samples. HAL reduces labeling effort while preserving classification accuracy. Compared to passive learning approaches, it significantly enhances efficiency in cryo-ET data analysis, as shown in Fig. 3.

Fig. 3.

Illustration of the active learning approach in cryo-ET classification. (a) Traditional passive recognition pipeline requires extensive labeling; (b) HAL approach is guided by active learning.

Source: Reproduced from Du et al. (2021) under the Creative Commons Attribution License (CC BY).
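The uncertainty-sampling component of active learning can be sketched as follows. This example uses predictive entropy as the uncertainty measure, which is one common choice and an assumption here; it does not implement the discriminative sampling part of HAL, and the function names are hypothetical.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    """Shannon entropy of each row of class probabilities."""
    return -(probs * np.log(probs + eps)).sum(axis=1)

def select_most_uncertain(probs, budget):
    """Indices of the `budget` unlabeled samples with the highest entropy."""
    entropy = predictive_entropy(probs)
    return np.argsort(-entropy)[:budget]

# Toy classifier outputs for four unlabeled subtomograms and three classes.
probs = np.array([
    [0.98, 0.01, 0.01],   # confident
    [0.34, 0.33, 0.33],   # almost uniform: most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
])
to_label = select_most_uncertain(probs, budget=2)
print(to_label)
```

The selected samples are sent for manual annotation, the classifier is retrained, and the selection is repeated until the labeling budget is exhausted.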

2.6.2.5. Open-set learning.

Open-set noise in cryo-ET data refers to the presence of unknown or unseen macromolecular structures in the data, which are not represented in the training set of the machine learning models. This problem arises due to the vast diversity and complexity of cellular environments, where new and unknown structures are frequently encountered, making it challenging for models trained on limited datasets to accurately recognize and classify them. In AITom, we addressed the open-set noise problem in cryo-ET data by proposing the Soft Large Margin Centralized Cosine Loss (Soft LMCCL), a novel loss function for deep neural networks (Du et al., 2019). This approach enhances the model’s ability to recognize unseen macromolecular structures, thereby reducing the impact of open-set noise and improving the reliability of structure recognition in cryo-ET datasets.
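To give a sense of how margin-based cosine losses work, the sketch below implements a generic large-margin cosine loss in NumPy. It is a stand-in for illustration and is not the Soft LMCCL formulation, whose centralization and soft-margin terms are not reproduced here; the scale and margin values are arbitrary assumptions.

```python
import numpy as np

def large_margin_cosine_loss(features, weights, labels, scale=30.0, margin=0.35):
    """Softmax cross-entropy on scaled cosine similarities,
    with a margin subtracted from the true-class cosine."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    logits = scale * (f @ w.T)                      # (n_samples, n_classes)
    rows = np.arange(len(labels))
    logits[rows, labels] -= scale * margin          # penalize the true class
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[rows, labels].mean())

rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))      # toy embeddings
weights = rng.normal(size=(3, 4))       # one weight vector per known class
labels = np.array([0, 1, 2, 0, 1, 2])

loss_with_margin = large_margin_cosine_loss(features, weights, labels)
loss_without_margin = large_margin_cosine_loss(features, weights, labels, margin=0.0)
```

The margin forces embeddings of known classes to form tighter angular clusters, which leaves more room in the embedding space to separate samples that belong to none of the known classes.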

2.6.3. Unsupervised learning based models

2.6.3.1. AutoEncoder3D.

In scenarios where labeled data is unavailable, unsupervised learning-based methods become essential. AITom employs a convolutional autoencoder for unsupervised subtomogram clustering, reducing the need for manual selection from large datasets (Zeng et al., 2018). It filters raw subtomograms by grouping them into fewer than 100 homogeneous clusters, significantly improving efficiency. To effectively group subtomograms containing the same structure in different orientations, it includes a pose normalization preprocessing step, normalizing the orientation and displacement of structures within each subtomogram. Clustering is then performed on the latent space encodings of subtomograms using K-means. Moreover, when a combination of labeled and unlabeled subtomogram data is available, semi-supervised learning can enhance both supervised classification and unsupervised encoding learning. AITom implements a semi-supervised autoencoding classifier that integrates the above classification model with the autoencoder to learn both tasks simultaneously. The feature extraction layers are shared between the two tasks, allowing for mutual reinforcement of learning processes (Liu et al., 2019).
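The clustering step on latent encodings can be sketched with a small NumPy implementation of K-means. The encodings here are synthetic stand-ins for autoencoder outputs, and the deterministic farthest-point initialization is a simplification chosen for reproducibility, not necessarily the initialization used in AITom.

```python
import numpy as np

def farthest_point_init(x, k):
    """Deterministic initialization: start at x[0], then repeatedly add
    the point farthest from all centers chosen so far."""
    centers = [x[0]]
    for _ in range(1, k):
        d2 = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d2.argmax()])
    return np.stack(centers).astype(float)

def kmeans(encodings, k, n_iter=50):
    """Plain Lloyd iterations on latent encodings of shape (n_samples, n_dims)."""
    centers = farthest_point_init(encodings, k)
    for _ in range(n_iter):
        d2 = ((encodings[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        assign = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = encodings[assign == j]
            if len(members) > 0:                 # keep old center if cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign, centers

rng = np.random.default_rng(0)
latent = np.concatenate([rng.normal(0.0, 0.5, size=(20, 8)),    # structure A
                         rng.normal(10.0, 0.5, size=(20, 8))])  # structure B
assignments, centers = kmeans(latent, k=2)
```

In practice the number of clusters is chosen larger than the expected number of structures, and the resulting clusters are then inspected and grouped manually.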

2.6.3.2. Harmony.

AITom integrates Harmony (Uddin et al., 2022), an unsupervised learning framework that disentangles semantic content from transformation factors in cryo-ET data. Using cross-contrastive learning and latent space decomposition, Harmony groups structurally similar subtomograms and improves classification accuracy. This capability makes Harmony a valuable tool for analyzing and classifying subtomograms in cryo-ET datasets, enabling researchers to better understand the structural heterogeneity of macromolecules in a noisy 3D environment.

2.6.3.3. DISCA.

DISCA (Deep Iterative Subtomogram Clustering Approach) performs unsupervised subtomogram clustering and has been integrated into AITom. The process begins with feature extraction by YOPO. Once YOPO processes the subtomograms to extract 3D structural features, these features are employed to group subtomograms into structurally similar clusters using GMM (Gaussian Mixture Models) (Figueiredo and Jain, 2002), all without the necessity for prior labeling or templates. The overall workflow of DISCA is shown in Fig. 4. DISCA adopts an iterative refinement strategy: as more data are analyzed, the feature space is dynamically updated, facilitating the reassignment of subtomograms to their most fitting clusters based on structural similarities. This unsupervised and iterative methodology allows for the effective and autonomous discovery of diverse and potentially novel structural patterns in the biological samples under examination.

Fig. 4.

Workflow of DISCA exemplified on a Synechocystis cell. (a) 2D slice view of the template-free particle picking (DoG) on the raw tomogram. (b) Unsupervised training of the YOPO neural network by iteratively clustering extracted features. (c) Discovered patterns by DISCA re-embedded to the original tomogram space.

Source: Reproduced from Zeng et al. (2023) under the Creative Commons Attribution License (CC BY).

2.6.4. Subtomogram classification method selection

Subtomogram classification in cryo-ET depends on data availability and classification complexity. Supervised models (i.e., DSRF3D-v2, RB3D, CB3D, AttPNet, YOPO) are effective when large labeled datasets are available, extracting rich 3D features. Limited-supervision approaches (ProtoNet-CE, domain adaptation, domain randomization) improve generalization with minimal labels, while active learning (HAL) optimizes annotation efficiency and open-set learning (Soft LMCCL) enhances recognition of novel structures. Unsupervised methods (AutoEncoder3D, Harmony, DISCA) enable structural organization without labels, filtering noise, disentangling transformations, and clustering unknown macromolecular structures. To make a more direct comparison, we systematically summarize the architectures, optimization strategies, and performance characteristics of deep learning models in AITom (Table 2), highlighting their suitability for classification, segmentation, and structural analysis in cryo-ET.

Table 2.

Architectural details and performance characteristics of selected AITom deep learning models used in cryo-ET for classification, segmentation, and feature extraction.

Model | Type | Layers | Pooling | Optimization | Performance
YOPO | CNN (single pooling) | 3×3×3 conv, single pooling, FC | Single pooling | SGD/Adam, cross-entropy | Low SNR robust, fast classification
DSRF3D-v2 | 3D VGGNet | 3×3×3 conv, 2×2×2 pooling, FC | 2×2×2 max pooling | SGD (Nesterov 0.9) | Feature extraction, reduces overfitting
RB3D | Residual CNN | Residual blocks, shortcut connections | 2×2×2 max pooling | SGD, cross-entropy | Stable training, mitigates gradient loss
CB3D | 3D CNN (C3D-based) | Deep 3×3×3 conv, max pooling | Max pooling (5 layers) | ReLU, softmax | High classification accuracy, robust to noise, missing wedge
AttPNet | 3D PointNet | Point-wise attention, CW-EdgeConv, feature masking | No explicit pooling | Adam, cross-entropy | Enhances fine-structure recognition, robust to missing points and noise
AutoEncoder3D | Encoder–Decoder | Encoder–decoder, latent space | 3D max pooling, upsampling | K-means/GMM | Extracts latent representations, enables clustering and structural organization
Harmony | Disentanglement | Encoder–decoder, feature disentanglement | No explicit pooling | Cross-contrastive, KL loss | Invariant to transformations

We evaluated the classification performance of different models using the SHREC 2021 dataset (Gubins et al., 2021), which contains cryo-ET subtomograms of macromolecular complexes with varying molecular weights. The dataset includes small (< 200 kDa), medium (200–600 kDa), and large (> 600 kDa) particles, allowing for an assessment of classifier robustness across different size scales. We randomly split the dataset into training, validation, and test sets with a ratio of 7:1:2. Each model was trained for 200 epochs to ensure convergence. The results in Table 3 show that classification accuracy generally increases with molecular weight, with large particles being easier to classify. RB3D achieved the highest overall accuracy (99.80%), demonstrating strong performance across all size categories. Small particles exhibited the most variability, with 2CG9 being particularly challenging for some models. YOPO and AutoEncoder (fully supervised version) performed similarly (99.77%), while CB3D had slightly lower accuracy (99.65%), particularly for small and medium-sized particles. These findings highlight the importance of robust feature extraction for small particle classification in cryo-ET.

Table 3.

Fully-supervised subtomogram classification accuracy (%) of different AITom models, sorted by particle size. The highest accuracy for each particle is bolded.

PDB ID | Name | Molecular weight (kDa) | AutoEncoder | CB3D | DSRF3D-v2 | RB3D | YOPO

Small Particles (<200 kDa)
1S3X | Hsp70 ATPase | 42.75 | 100.00 | 99.68 | 100.00 | 99.36 | 99.68
3QM1 | LJ0536 S106A | 62.62 | 100.00 | 100.00 | 100.00 | 100.00 | 99.64
3GL1 | Ssb1, Hsp70 | 84.61 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00
3H84 | GET3 | 158.08 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00
2CG9 | Hsp90-Sba1 | 188.73 | 97.82 | 97.45 | 97.82 | 100.00 | 99.64

Medium Particles (200–600 kDa)
3D2F | Sse1p, Hsp70 | 236.11 | 100.00 | 100.00 | 100.00 | 100.00 | 99.03
1U6G | Cand1-Cul1-Roc1 | 238.82 | 100.00 | 100.00 | 100.00 | 100.00 | 99.65
3CF3 | P97/vcp | 541.74 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00
1BXN | Rubisco | 559.96 | 99.30 | 98.60 | 98.25 | 98.95 | 99.65
1QVR | ClpB | 593.36 | 100.00 | 100.00 | 100.00 | 99.68 | 100.00

Large Particles (>600 kDa)
4CR2 | 26S proteasome | 1309.28 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00
5MRC | Yeast mito ribosome | 3325.59 | 100.00 | 100.00 | 100.00 | 99.66 | 100.00

Overall Accuracy | | | 99.77 | 99.65 | 99.68 | 99.80 | 99.77
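The random 7:1:2 split described above can be sketched as follows. This is an illustrative helper operating on sample indices; the function name, the seed, and the toy dataset size are assumptions and do not correspond to the actual evaluation scripts.

```python
import numpy as np

def split_indices(n_samples, ratios=(0.7, 0.1, 0.2), seed=0):
    """Randomly partition sample indices into train / validation / test subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(ratios[0] * n_samples))
    n_val = int(round(ratios[1] * n_samples))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]       # remainder goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Fixing the random seed keeps the partition identical across models, so that accuracies are compared on the same test subtomograms.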

2.7. Subtomogram segmentation

Segmentation of subtomograms in cryo-ET is a critical process to accurately isolate and examine specific structures or regions within a tomogram. This segmentation step is instrumental in addressing the challenges presented by the complexity and noise typical in cryo-ET data. Effective segmentation facilitates more precise biological interpretations, allowing for in-depth structural and functional analyses at the molecular and subcellular levels.

2.7.1. Supervised segmentation approaches

In subtomogram analysis, segmentation tasks typically follow the particle picking step. A notable method, described in Xu and Alber (2013), combines template matching from AITom with a recursive tracing algorithm (Rigort et al., 2012). This method produces a cross-correlation matrix and three template rotation matrices from the template matching process. These matrices feed into the tracing algorithm, yielding a binary mask that encapsulates the tracing results. The algorithm’s similarity function evaluates cross-correlation at a voxel derived from template matching, while incorporating smoothness, linearity, and distance coefficients for the subsequent voxel. Users can adjust thresholds and parameters to fine-tune the tracing results. The actin fiber segmentation method has also been included in AITom.

Segmentation tasks in subtomogram analysis often share methodologies with classification tasks due to overlapping data sources. For example, AttPNet (Yang et al., 2020) performs segmentation using a global attention module. This module generates a mask that accentuates each point’s importance within a 3D point set. The mask is applied by adding elements to reflect a segmentation network. This network then provides segmentation scores, effectively segmenting the point set based on the prominent features highlighted by the attention-driven mask.

2.7.2. Segmentation approaches with limited supervision

2.7.2.1. AutoEncoder3D.

A notable example of segmentation with limited supervision is the application of AutoEncoder3D (Zeng et al., 2018), which extends its functionality beyond clustering and weakly supervised classification to facilitate semantic segmentation. The key process involves manually selecting and grouping clusters derived from the autoencoder. These grouped clusters are then utilized to train dense classifiers that perform voxel-level classification, leading to semantic segmentation of tomograms. This methodology significantly minimizes the need for labor-intensive, voxel-wise manual segmentation of 3D images. The only manual intervention required in this pipeline is the careful selection and grouping of image feature clusters, determined from a set of potential cluster centers decoded by the autoencoder. Thus, the segmentation process is characterized as weakly supervised, demanding minimal human involvement, and providing an efficient route for segmenting complex tomographic data.

2.7.2.2. COS-Net.

For subtomogram segmentation, AITom integrates COS-Net (Cryo-ET One-Shot Network), a one-shot learning framework capable of performing both classification and 3D segmentation of subtomograms (Zhou et al., 2020). COS-Net utilizes a Siamese network architecture with volume encoders, volume decoders, and feature encoders. In this setup, volume encoders extract the features of subtomograms, which are then transformed by feature encoders for one-shot learning. Concurrently, volume decoders generate the coarse attention/segmentation of the subtomograms. This approach allows COS-Net to address the classification and segmentation tasks in cryo-ET 3D imaging data.

2.7.2.3. CryoSAM.

AITom integrates CryoSAM (Zhao et al., 2024), an efficient, training-free method for segmenting cryo-ET tomograms. CryoSAM leverages existing 2D foundation models, such as Segment Anything (SAM) (Kirillov et al., 2023), and presents a prompt-based segmentation approach that does not require supervised training with segmentation masks as training labels. CryoSAM is composed of (1) a prompt-based 3D segmentation system that uses prompts (i.e., particle coordinates) to complete single-particle instance segmentation recursively with Cross-Plane Self-Prompting, and (2) a Hierarchical Feature Matching mechanism that efficiently matches relevant features with extracted tomogram features. The two components work together to enable the segmentation of all particles of one category with just one particle-specific prompt. This method significantly reduces the need for manual annotations and improves efficiency and accuracy in identifying and segmenting particles within cryo-ET tomograms.

2.7.3. Unsupervised segmentation approaches

2.7.3.1. PUB-SalNet.

The scarcity of ground-truth masks in subtomogram segmentation tasks requires the use of unsupervised learning approaches. AITom incorporates PUB-SalNet (Pre-Trained Unsupervised Self-Aware Backpropagation Network for Biomedical Salient Segmentation) (Chen et al., 2020), an unsupervised technique originally developed for biomedical images and specifically validated for cryo-ET salient segmentation. PUB-SalNet employs a pre-trained deep feature extraction model in combination with the U-SalNet model, based on the U-Net architecture. PUB-SalNet selectively focuses on salient structures even in challenging cryo-ET data. A key feature of PUB-SalNet is its use of unsupervised attentional backpropagation, which allows for iterative refinement of the segmentation process. This enables PUB-SalNet to enhance its detection of salient regions with high accuracy, without any reliance on labeled training data. This robust design overcomes major hurdles in conventional segmentation tasks and offers an efficient, effective solution for cryo-ET analysis, proving invaluable for researchers working with limited annotated datasets.

2.7.4. Subtomogram segmentation method selection

Supervised segmentation approaches, such as AttPNet, provide high precision when labeled segmentation masks are available, enabling detailed voxel-level segmentation. Limited-supervision methods, including AutoEncoder3D, COS-Net, and CryoSAM, balance efficiency and accuracy: they reduce reliance on annotated data by leveraging clustering, one-shot learning, and prompt-based segmentation to achieve efficient and scalable segmentation. Unsupervised methods, such as PUB-SalNet, address the scarcity of ground-truth labels by applying self-aware backpropagation and deep feature extraction to identify salient structures.

2.8. CNN model interpretation

As mentioned above, CNNs play a pivotal role in the AITom library, with notable examples including CB3D and YOPO. These CNN-based models ensure accuracy and reliability in automated analyses, particularly for identifying and segmenting complex macromolecular structures in 3D imaging data. The value of these models also depends on the ability to gain insight into their decision-making processes. To enhance the interpretability and reliability of these complex CNN models, AITom has incorporated the Respond-CAM (Respond-weighted Class Activation Mapping) (Zhao et al., 2018a) feature. Respond-CAM highlights class-discriminative regions within 3D tomograms by generating heatmaps that visualize the most influential features in model predictions. The inclusion of Respond-CAM in AITom is particularly valuable for segmenting and classifying macromolecular structures in noisy cryo-ET data, improving the transparency and reliability of deep learning-based subtomogram analysis.
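The heatmap computation can be sketched in NumPy as follows. The weighting follows our reading of the published Respond-CAM formulation, in which each channel weight is the activation-weighted average of the class-score gradients; the activations and gradients below are random stand-ins for tensors that would normally be obtained from a trained network, and the function name is hypothetical.

```python
import numpy as np

def respond_cam(activations, gradients, eps=1e-5):
    """activations, gradients: arrays of shape (channels, D, H, W), where
    gradients holds d(class score) / d(activations) for the chosen class."""
    spatial_axes = (1, 2, 3)
    weights = (activations * gradients).sum(axis=spatial_axes) / (
        activations.sum(axis=spatial_axes) + eps)
    # Weighted sum of the feature maps over channels -> 3D heatmap.
    return np.tensordot(weights, activations, axes=(0, 0))

rng = np.random.default_rng(0)
activations = rng.random(size=(4, 6, 6, 6))     # stand-in for post-ReLU feature maps
gradients = rng.normal(size=(4, 6, 6, 6))       # stand-in for back-propagated gradients
heatmap = respond_cam(activations, gradients)
print(heatmap.shape)
```

With this weighting, the heatmap sums (up to the small constant eps) to the total of activation times gradient, so the heatmap intensity remains directly tied to the contribution of the layer to the class score.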

2.9. Subtomogram alignment and averaging

AITom integrates essential tasks for structural classification and recovery in cryo-ET, focusing on subtomogram alignment and averaging. These processes are vital for interpreting the intricate 3D structures obtained from cryo-ET data.

Subtomogram Alignment:

This process involves aligning a structural template with a subtomogram, or aligning a pair of subtomograms. The primary objective is to establish a geometric correspondence between them, which is critical for accurately analyzing and interpreting the 3D structural data obtained from cryo-ET.

Subtomogram Averaging:

Averaging multiple noisy subtomograms is a strategic approach to estimating the underlying structure. This method is particularly useful in scenarios where the SNR is low, and the individual subtomograms are not sufficiently clear to reveal detailed structural information. By averaging, it becomes possible to enhance the signal, reduce the noise, and thereby recover a more accurate representation of the underlying macromolecular structure.

These geometrical methods implemented in AITom play a vital role in extracting meaningful information from the complex data produced by cryo-ET. They enable researchers to align and average subtomograms efficiently, paving the way for deeper insights into the spatial arrangements and interactions of subcellular components.
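The noise-reduction effect of averaging can be checked numerically. The sketch below assumes perfectly aligned subtomograms, independent Gaussian noise, and no missing wedge, which are idealizations; under these assumptions the SNR of the average grows roughly linearly with the number of subtomograms.

```python
import numpy as np

def snr(estimate, truth):
    """Signal-to-noise ratio defined as var(signal) / var(residual noise)."""
    noise = estimate - truth
    return truth.var() / noise.var()

rng = np.random.default_rng(1)
truth = np.zeros((20, 20, 20))
truth[6:14, 6:14, 6:14] = 1.0                    # toy structure

n_subtomograms = 100
noise_std = 2.0
subtomograms = truth + rng.normal(scale=noise_std,
                                  size=(n_subtomograms,) + truth.shape)

snr_single = snr(subtomograms[0], truth)
snr_average = snr(subtomograms.mean(axis=0), truth)
print(snr_single, snr_average, snr_average / snr_single)
```

In real data the gain is smaller, because residual alignment errors and structural heterogeneity blur the average, which is why alignment and clustering are performed jointly with averaging.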

2.9.1. Geometrical methods

Geometrical methods are indispensable in template-free cryo-electron tomography (cryo-ET) analysis, particularly for tasks like subtomogram alignment and averaging. These methods form a core component of the AITom platform, facilitating the efficient processing and analysis of complex cryo-ET data.

2.9.1.1. Fast alignment.

Fast alignment in subtomogram analysis estimates the 3D rigid body geometric correspondence between a structural template and a subtomogram or between pairs of subtomograms. AITom incorporates three efficient subtomogram alignment methods (Xu et al., 2012; Xu and Alber, 2012; Lü et al., 2019), which utilize heuristics to expedite this computationally demanding process, in contrast to exhaustive search methods. Additionally, the known missing wedge mask is employed to mitigate the missing wedge effect through constrained cross-correlation (Förster et al., 2008), enhancing alignment accuracy under limited tilt angle imaging.
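The translational part of alignment can be illustrated with FFT-based cross-correlation. The sketch below is a deliberate simplification: it recovers only a circular translation, and it ignores the rotational search and the missing wedge constraint that the fast alignment methods in AITom handle. The function name is hypothetical.

```python
import numpy as np

def estimate_shift(reference, moving):
    """Return the circular shift that, applied to `moving` with np.roll,
    maximizes its cross-correlation with `reference`."""
    cc = np.real(np.fft.ifftn(np.fft.fftn(reference) *
                              np.conj(np.fft.fftn(moving))))
    peak = np.unravel_index(np.argmax(cc), cc.shape)
    # Map peak indices above half the box size to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, cc.shape))

rng = np.random.default_rng(3)
reference = rng.normal(size=(16, 16, 16))        # toy reference volume
true_shift = (2, -3, 4)
moving = (np.roll(reference, shift=true_shift, axis=(0, 1, 2))
          + rng.normal(scale=0.5, size=reference.shape))

estimated = estimate_shift(reference, moving)    # expected to undo true_shift
aligned = np.roll(moving, shift=estimated, axis=(0, 1, 2))
print(estimated)
```

Computing the correlation for all translations at once in Fourier space is what makes translation search inexpensive; the rotational degrees of freedom are the costly part that fast alignment heuristics aim to accelerate.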

2.9.1.2. Alignment-based averaging.

In subtomogram averaging, potential biases are minimized by foregoing external structural templates. This approach, known as alignment-based subtomogram averaging (Briggs, 2013), involves aligning a collection of subtomograms with similar structures to their cumulative average and iteratively re-averaging them. The process continues until the resolution of the averaged subtomogram improves and stabilizes. In cases where the subtomograms are heterogeneously structured, this method segments them into homogeneous clusters for individual averaging. AITom facilitates this clustering and averaging process by incorporating a specialized framework that integrates dimension reduction, clustering, and the selection of optimal cluster cutoffs, allowing for a more streamlined and effective iterative averaging process (Xu et al., 2012).

2.9.1.3. Fast alignment maximum likelihood averaging.

The Fast Alignment Maximum Likelihood (FAML) method (Zhao et al., 2018b) represents an innovative integration of rapid alignment techniques with maximum likelihood-based averaging for efficient and robust subtomogram averaging. This method utilizes a fast alignment strategy (Xu et al., 2012), which calculates a series of sub-optimal rigid transformations within a translation-invariant boundary. Concurrently, it employs a maximum-likelihood approach (Scheres et al., 2009) that establishes a data model and employs an Expectation–Maximization (EM) algorithm for parameter updates. Notably, FAML has demonstrably enhanced the resolution of recovered structures when compared to solely fast alignment-based methods. Moreover, it achieves a remarkable speedup, ranging from 2 to 5 times faster than traditional maximum-likelihood approaches, all while maintaining resolution integrity.

2.9.2. Deep learning methods

AITom incorporates two deep learning methods for subtomogram alignment, Gum-Net and Jim-Net. Both are unsupervised learning methods.

2.9.2.1. Gum-Net.

The Geometric Unsupervised Matching Network (Gum-Net) (Zeng and Xu, 2020) is a notable development in the realm of deep learning for cryo-ET. Gum-Net is specifically engineered to establish geometric correspondence between two images, particularly for the tasks of 3D subtomogram alignment and averaging. This network boasts an end-to-end trainable architecture, comprising three innovative modules tailored to preserve feature spatial information and enhance the propagation of feature-matching information. One of the key strengths of Gum-Net is its fully unsupervised training approach, which optimizes a matching metric without the need for ground truth transformation information or any category-level or instance-level matching supervision. This aspect is especially crucial given the high levels of transformation variation and noise present in cryo-ET images.

2.9.2.2. Jim-Net.

Joint Image Alignment and Clustering Network (Jim-Net) (Zeng et al., 2021) is a multi-task model for cryo-ET. Jim-Net employs an architecture that shares feature extractors for both the clustering and alignment tasks of the source image. Its alignment branch utilizes a coarse-to-fine alignment strategy, where different feature extractors are used at various stages to propose and perform image transformations. This architecture includes fine image-transforming functions, which enable the network to propose and execute image transformations based on the features extracted from the source and target images. This approach allows Jim-Net to effectively align images while also clustering them based on their features in an unsupervised manner. An example of Jim-Net alignment is presented in Fig. 5.

Fig. 5.

Example of isosurface representation of Jim-Net alignment on simulated SNR 100 dataset, including 3D grayscale heterogeneous structures (spliceosome, RNA polymerase-rifampicin complex, RNA polymerase II elongation complex, ribosome, and capped proteasome). A random subtomogram from each cluster was chosen as the target subtomogram and the remaining subtomograms from the same cluster were aligned to it.

Source: Reproduced from Zeng et al. (2021), ICCV 2021, available via open access through the Computer Vision Foundation (CVF).

2.9.3. Subtomogram alignment and averaging method selection

Geometrical methods like fast alignment, alignment-based averaging, and FAML provide efficient, template-free refinement. Fast alignment accelerates rigid-body transformations, alignment-based averaging iteratively improves resolution, and FAML integrates maximum-likelihood estimation for better accuracy. Deep learning methods, including Gum-Net and Jim-Net, offer unsupervised alignment, with Gum-Net optimizing feature matching and Jim-Net combining alignment and clustering. Geometrical methods suit high-throughput alignment, while deep learning models handle high-noise and heterogeneous data.

2.10. Highlights of AITom

AITom advances cryo-ET data analysis by integrating conventional image processing with deep learning-based techniques. While existing tools such as IMOD (Kremer et al., 1996), EMAN2 (Tang et al., 2007), RELION (Scheres, 2012), PyTom (Hrabe et al., 2012), and Dynamo (Castaño-Díez et al., 2012) focus on alignment and averaging, AITom focuses more on subtomogram classification and segmentation, and attempts to address realistic challenges in cryo-ET analysis, including low SNR, structural heterogeneity, limited labeled data, and computational efficiency.

2.10.1. Low SNR and structural heterogeneity

Cryo-ET subtomograms are inherently noisy and structurally diverse. Traditional workflows address low SNR and heterogeneity through 3D classification, iterative alignment, and averaging, as implemented in RELION, Dynamo, and EMAN2. While these methods enhance contrast and resolution, they require large particle numbers, are susceptible to reference bias and false positives, and often fail to capture rare conformations. In contrast, AITom leverages deep learning-based classification models such as YOPO and Harmony, which directly extract structural features from raw subtomograms and perform 3D classification without iterative averaging. These models are inherently more robust to noise and better capture structural heterogeneity by learning intrinsic structural patterns beyond rigid template matching. This data-driven approach reduces dependence on large labeled datasets while improving sensitivity to rare or flexible macromolecular conformations, offering a more efficient and scalable alternative to traditional 3D classification.

2.10.2. Limited labeled data and model generalization

Unlike well-established fields such as X-ray crystallography or single-particle cryo-EM, cryo-ET lacks large, well-annotated training datasets, making fully supervised learning impractical for tasks such as denoising, 3D classification, and segmentation. AITom addresses this limitation, particularly in subtomogram classification and segmentation, through few-shot learning (ProtoNet-CE), domain adaptation, active learning (HAL), and prompt-based segmentation (CryoSAM). These methods enable models to generalize from minimal labeled data while promoting structured foundation models, ensuring robust performance across diverse experimental datasets. By reducing annotation requirements while preserving accuracy, AITom enhances the efficiency and scalability of AI-driven cryo-ET analysis in real-world applications.

2.10.3. Computational complexity and scalability

Cryo-ET produces large, complex datasets that require extensive computational processing across multiple stages, making high-throughput processing essential for efficiently handling increasing data volumes while maintaining accuracy. AITom is designed to support high-throughput cryo-ET analysis by integrating unsupervised learning and GPU-accelerated computation to reduce manual effort and computational overhead. DISCA automates subtomogram clustering for optimized refinement, while Harmony enhances structural grouping by disentangling content from transformation factors. Gum-Net and Jim-Net improve unsupervised alignment, reducing reliance on computationally expensive iterative approaches. Clustering-based refinement further streamlines processing, and GPU acceleration ensures scalable execution of deep learning models. By integrating these strategies, AITom enables high-throughput cryo-ET analysis, enhancing structural studies and accelerating macromolecular discovery.

2.11. Remote analysis with scripting and graphical hybrid

Cryo-ET data volumes are growing rapidly, particularly in research laboratories. As the volume of data outpaces the capabilities of single workstations, there is a growing preference for storing and processing cryo-ET data on servers or computer clusters. These clusters, often shared among multiple users or laboratories, provide the necessary computational power and storage capacity to handle large datasets efficiently. AITom addresses this need by offering a hybrid approach that combines Python scripting with graphical user interface (GUI) elements, facilitated through Jupyter Notebook. This setup allows users to engage in interactive analysis remotely, overcoming the limitations of a single workstation setup. The Jupyter Notebook environment streamlines the process, enabling users to execute scripts and visualize results directly in their web browsers. This method ensures efficient data handling and processing, essential for complex cryo-ET analysis while maintaining user-friendly access and interaction with the data.

2.11.1. Example use case of remote analysis with scripting and graphical hybrid

This section illustrates the practical application of AITom to particle picking, classification, and reconstruction tasks. Using the Difference of Gaussian (DoG) method (Pei et al., 2016), a popular choice for particle picking in single-particle tomography, AITom streamlines these tasks through its user-friendly interfaces. A notable feature of AITom is its combination of Python scripting and graphical interaction, which makes data processing more flexible. This hybrid approach divides the workflow into manageable blocks, including data processing, result visualization, and manual selection. Each block can be fine-tuned based on intermediate outcomes, allowing more precise control over the analysis process.
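The DoG picking step can be sketched in a few lines. The following is a minimal NumPy-only illustration, not AITom's implementation: the function names (`gaussian_blur_fft`, `dog_pick`, `make_blob_volume`), the ratio `k` between the two Gaussian widths, the FFT-based blur with periodic boundaries, and the assumption that particles appear brighter than the background are all choices made for this example.

```python
import itertools
import numpy as np

def gaussian_blur_fft(vol, sigma):
    """Isotropic Gaussian blur of a 3D volume via the FFT (periodic boundaries)."""
    grids = np.meshgrid(*[np.fft.fftfreq(n) for n in vol.shape], indexing="ij")
    freq_sq = sum(g ** 2 for g in grids)
    # Fourier transform of a Gaussian with standard deviation `sigma` (in voxels).
    transfer = np.exp(-2.0 * np.pi ** 2 * sigma ** 2 * freq_sq)
    return np.real(np.fft.ifftn(np.fft.fftn(vol) * transfer))

def dog_pick(vol, sigma, k=1.1, top_n=100):
    """Return the top_n strict local maxima of the DoG response as (z, y, x) rows.

    `k` (ratio of the two Gaussian widths) is an assumed value for this sketch.
    Bright-on-dark particles are assumed; invert the volume otherwise.
    """
    dog = gaussian_blur_fft(vol, sigma) - gaussian_blur_fft(vol, k * sigma)
    is_peak = np.ones(dog.shape, dtype=bool)
    # A voxel is a peak if it is strictly greater than all 26 neighbours.
    for shift in itertools.product((-1, 0, 1), repeat=3):
        if shift != (0, 0, 0):
            is_peak &= dog > np.roll(dog, shift, axis=(0, 1, 2))
    coords = np.argwhere(is_peak)
    scores = dog[is_peak]
    order = np.argsort(scores)[::-1][:top_n]  # strongest responses first
    return coords[order], scores[order]

def make_blob_volume(shape, centers, blob_sigma):
    """Synthetic test volume: one Gaussian blob per center (hypothetical data)."""
    zz, yy, xx = np.indices(shape)
    vol = np.zeros(shape)
    for cz, cy, cx in centers:
        r2 = (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2
        vol += np.exp(-r2 / (2.0 * blob_sigma ** 2))
    return vol

centers = [(8, 8, 8), (20, 22, 12)]
volume = make_blob_volume((32, 32, 32), centers, blob_sigma=2.0)
picks, scores = dog_pick(volume, sigma=2.0, top_n=2)
```

On this noise-free toy volume the two strongest responses coincide with the blob centers. On real tomograms the response is much noisier, which is one reason a manual selection step after automated picking is useful.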

For immediate visualization, AITom offers a 2D display window equipped with interactive controls. A key parameter in this stage is the sigma of Gaussian filters, ideally set to match the size of the target particles. The particle picking results are visually represented, with identified particles marked by blue circles on each tomogram slice (refer to Fig. 6). Following the automated particle picking, users have the option for manual selection, further refining the analysis based on their expertise. The outcomes of this step are saved in a pickle file, with each particle location uniquely identified. The Autoencoder3D module in AITom then takes over for classification and reconstruction tasks. Default settings use 32×32×32 subtomograms as input. In the context of unsupervised learning, as described in Liu et al. (2019), the classification relies on k-means clustering, where the number of clusters is set according to the anticipated particle types. For reconstruction tasks, the cluster count is set to one. The training progress is dynamically displayed in the Jupyter Notebook, culminating in the visual presentation of reconstructed particles. This interactive and flexible approach makes AITom a powerful tool for cryo-ET data analysis (see Table 4).
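The clustering step described above can be illustrated with a small k-means routine applied to feature vectors. This is a hedged sketch under stated assumptions: the random `features` array merely stands in for encoder outputs of an autoencoder, the function names are hypothetical, and the deterministic farthest-point seeding is a simplification chosen for reproducibility rather than the initialization used in AITom.

```python
import numpy as np

def farthest_point_init(features, n_clusters):
    """Deterministic seeding: take the first point, then repeatedly add the
    point farthest from all centers chosen so far (an assumption of this sketch)."""
    centers = [features[0]]
    for _ in range(1, n_clusters):
        d2 = np.min([((features - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(features[int(np.argmax(d2))])
    return np.array(centers, dtype=float)

def assign(features, centers):
    """Index of the nearest center (squared Euclidean distance) for each row."""
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(features, n_clusters, n_iter=100):
    """Plain Lloyd iterations; returns (labels, centers)."""
    features = np.asarray(features, dtype=float)
    centers = farthest_point_init(features, n_clusters)
    for _ in range(n_iter):
        labels = assign(features, centers)
        updated = centers.copy()
        for c in range(n_clusters):
            members = features[labels == c]
            if len(members) > 0:  # keep the old center if a cluster empties
                updated[c] = members.mean(axis=0)
        if np.allclose(updated, centers):
            break
        centers = updated
    return assign(features, centers), centers

# Hypothetical stand-in for encoder outputs: two well-separated groups of 8-D features.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 0.1, size=(20, 8)),
    rng.normal(10.0, 0.1, size=(20, 8)),
])
labels, centers = kmeans(features, n_clusters=2)       # classification: one cluster per expected type
_, mean_center = kmeans(features, n_clusters=1)        # single cluster: center is the overall mean
```

With the cluster count set to one, the single center is simply the mean of all feature vectors, which mirrors the reconstruction setting in which all particles are treated as one class.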

Fig. 6.

This figure provides a comprehensive overview of particle picking results as displayed in a local browser. The interface marks detected particles with blue circles for easy identification. Key controls include ‘sigma’, which adjusts the Gaussian kernel size for denoising purposes, ‘R’, determining the radius of the blue circles, and ‘z’, which selects the specific slice to be displayed. The interface also offers a zoom-in feature for detailed examination of specific sections of the slice.
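To make the roles of the ‘R’ and ‘z’ controls concrete, the helper below selects which picked particles would be drawn on a given slice. It is only an illustrative sketch: the function name `circles_for_slice` and the rule that a pick is shown when its z-coordinate lies within `z_tol` voxels of the displayed slice (defaulting to `R`) are assumptions, not a description of AITom's display logic.

```python
import numpy as np

def circles_for_slice(picks_zyx, z, R, z_tol=None):
    """Return (y, x, R) circles for picks lying within z_tol voxels of slice z.

    picks_zyx: iterable of (z, y, x) particle coordinates.
    z_tol defaults to R (an assumption of this sketch).
    """
    picks = np.asarray(picks_zyx, dtype=float).reshape(-1, 3)
    if z_tol is None:
        z_tol = R
    near = np.abs(picks[:, 0] - z) <= z_tol
    return [(float(y), float(x), float(R)) for _, y, x in picks[near]]

example_picks = [(8, 8, 8), (20, 22, 12)]
visible = circles_for_slice(example_picks, z=9, R=3)
hidden = circles_for_slice(example_picks, z=15, R=3)
```

Each returned tuple gives the in-plane center and radius of one circle, which a plotting library could then overlay on the selected slice.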

Table 4.

Comparison of AITom with existing cryo-ET analysis tools, organized by cryo-ET data processing steps.

Processing step | AITom | RELION | EMAN2 | Dynamo | PyTom

1. Simulation
Tomogram Simulation | Packing-Based | No | No | No | No
Subtomogram Simulation | Packing-Based, GAN-Based | No | No | No | No

2. Raw Data Pre-Processing
Denoising | Anisotropic Diffusion, etc. | No | Gaussian Filter | No | No

3. Particle Picking
Automated Picking | CNN-Based, DoG | DoG, Manual + Template | CNN-Based, DoG, Manual + Template | Semi-Automated Template | Template

4. Segmentation
Segmentation | Deep Learning (CryoSAM, COS-Net) | Not Available | Not Available | Not Available | Partial Support (Manual Processing)

5. Subtomogram Classification
Classification Approach | Deep Learning (YOPO, RB3D, CB3D) | Iterative Alignment + Clustering | Template Matching | Hierarchical Clustering | PCA-Based

6. Subtomogram Alignment
Alignment Method | Deep Learning (Gum-Net, Jim-Net) | Cross-Correlation-Based | Cross-Correlation-Based | Manual + Clustering | Statistical-Based Alignment

7. Averaging & Structural Refinement
Subtomogram Averaging | Fast Learning-Based | Maximum Likelihood | Template Matching | Clustering + Averaging | Statistical-Based
High-Resolution Refinement | No | Gold Standard Refinement | Iterative Refinement | Yes | No

8. Computational Efficiency & Robustness
Computational Efficiency | GPU-Accelerated (Fast Inference) | CPU-Intensive (Slow) | Optimized CPU-Based (Moderate) | CPU-Intensive (Slow) | Optimized CPU-Based (Moderate)
Low SNR Robustness | High (Deep Learning Feature Extraction) | Not Available | Partial Support (Heavy Preprocessing) | Not Available | Not Available

3. Discussion

AITom represents a significant advancement in cryo-ET data analysis by integrating conventional image processing with state-of-the-art deep learning techniques. Unlike existing tools such as IMOD (Kremer et al., 1996), EMAN2 (Tang et al., 2007), RELION (Scheres, 2012), PyTom (Hrabe et al., 2012), and Dynamo (Castaño-Díez et al., 2012), which primarily focus on tomogram reconstruction, subtomogram averaging, template matching, and manual refinement, AITom uniquely offers an end-to-end deep learning framework for automated classification and segmentation of macromolecular structures.

One of AITom’s key strengths lies in its deep learning-based subtomogram classification and segmentation modules, which enable direct particle classification without requiring iterative alignment and refinement. In contrast, traditional tools such as RELION and Dynamo rely on iterative alignment and clustering approaches for classification, which can be computationally expensive and sensitive to initial model selection. The benchmark results (Table 3) demonstrate that AITom’s CNN models (YOPO, RB3D, CB3D, DSRF3D-v2) achieve near-perfect classification accuracy across diverse macromolecular structures, significantly outperforming conventional template-matching approaches. Furthermore, with GPU acceleration, AITom enables high-throughput classification, reducing computational time while maintaining robustness against low signal-to-noise ratio (SNR) conditions.

In addition to classification, AITom integrates deep-learning-based segmentation methods, including CryoSAM and COS-Net, which allow automated particle segmentation directly from tomograms—an area where existing tools provide little or no support. These segmentation models address challenges associated with heterogeneous macromolecular assemblies, structural flexibility, and crowded environments, where traditional template-based segmentation approaches often fail.

For subtomogram alignment, AITom introduces deep learning-based methods such as Gum-Net and Jim-Net, which provide data-driven alignment without the need for cross-correlation-based search methods used in RELION and EMAN2. This approach enables fast, robust alignment, particularly for highly heterogeneous datasets. In contrast, PyTom employs statistical-based alignment techniques, while Dynamo primarily relies on manual curation and clustering-based methods.

AITom also distinguishes itself in the domain of subtomogram averaging and structural refinement. While tools like RELION and EMAN2 employ maximum likelihood-based refinement for high-resolution reconstructions, AITom focuses on fast-learning-based subtomogram averaging, which is optimized for high-throughput processing. EMAN2, for instance, applies template matching for particle selection but refines averages using iterative alignment. Although RELION remains essential for achieving high-resolution refinement through gold-standard averaging, AITom complements these methods by accelerating the initial classification and alignment of particles, reducing the dependency on iterative processing workflows.

Computationally, AITom is optimized for GPU-accelerated inference, making it significantly more efficient for large-scale cryo-ET datasets compared to CPU-intensive tools such as RELION and Dynamo. EMAN2 and PyTom are more optimized for CPU-based computations, with moderate processing efficiency. The robustness of AITom under low-SNR conditions further strengthens its applicability to challenging cryo-ET datasets, reducing the need for extensive manual pre-processing.

The release of AITom’s codebase marks the beginning of an ongoing effort to foster collaboration within the cryo-ET field. Our goal is to continually introduce new algorithms and machine-learning approaches that cater specifically to the needs of the cryo-ET research community. We invite the scientific community to engage with AITom, offer constructive feedback, and contribute to its development, thereby enriching the collective resource pool available for cryo-ET analysis.

Acknowledgments

We thank the following students and collaborators for their invaluable contributions to the development of AITom. Zhenxi Zhu, Jie Jin, Sinuo Liu, Zhu Zhan, Haoran Wang, Xueyao Guo, Keting Zhao, Mingzhu Liu, Kolade Alabi, and Zhuyun Jin have contributed significantly to the codebase and the creation of comprehensive tutorials. Their efforts have been integral to advancing the platform and making it accessible to the cryo-ET research community. This work was supported in part by U.S. NIH grants R01GM134020 and P41GM103712, and NSF grants DBI-1949629, DBI-2238093, IIS-2007595, IIS-2211597, and MCB-2205148. This work was also supported in part by UPMC Enterprises, Oracle Cloud credits and related resources provided by Oracle for Research, and computational resources from the AMD HPC Fund. XZ and MRU were supported in part by a fellowship from CMU CMLH.

Appendix A. Supplementary data

Supplementary material related to this article can be found online at https://doi.org/10.1016/j.jsb.2025.108207.

Footnotes

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Min Xu reports financial support was provided by Carnegie Mellon University. Xueying Zhan reports financial support was provided by Carnegie Mellon University. Mostofa Rafid Uddin reports financial support was provided by Carnegie Mellon University. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Xueying Zhan: Writing – review & editing, Writing – original draft, Software. Xiangrui Zeng: Writing – review & editing, Supervision, Software, Resources, Methodology. Mostofa Rafid Uddin: Writing – review & editing, Methodology. Min Xu: Supervision, Software, Project administration, Funding acquisition.

Data availability

The authors are unable or have chosen not to specify which data has been used.

References

  1. Alber Frank, 2008. Alber lab. http://web.cmb.usc.edu/people/alber/index.htm.
  2. Bepler Tristan, Kelley Kotaro, Noble Alex J., Berger Bonnie, 2020. Topaz-denoise: general deep denoising models for cryoEM and cryoET. Nat. Commun. 11 (1), 5208.
  3. Berman Helen M., Westbrook John, Feng Zukang, Gilliland Gary, Bhat Talapady N., Weissig Helge, Shindyalov Ilya N., Bourne Philip E., 2000. The protein data bank. Nucleic Acids Res. 28 (1), 235–242.
  4. Böhm Jochen, Frangakis Achilleas S., Hegerl Reiner, Nickell Stephan, Typke Dieter, Baumeister Wolfgang, 2000. Toward detecting and identifying macromolecules in a cellular context: template matching applied to electron tomograms. Proc. Natl. Acad. Sci. 97 (26), 14245–14250.
  5. Briggs John A.G., 2013. Structural biology in situ—the potential of subtomogram averaging. Curr. Opin. Struct. Biol. 23 (2), 261–267.
  6. Burt Alister, Toader Bogdan, Warshamanage Rangana, von Kügelgen Andriko, Pyle Euan, Zivanov Jasenko, Kimanius Dari, Bharat Tanmay A.M., Scheres Sjors H.W., 2024. An image processing pipeline for electron cryo-tomography in RELION-5. FEBS Open Bio 14 (11), 1788–1804.
  7. Candès Emmanuel J., Li Xiaodong, Ma Yi, Wright John, 2011. Robust principal component analysis? J. ACM 58 (3), 11.
  8. Castaño-Díez Daniel, Kudryashev Mikhail, Arheit Marcel, Stahlberg Henning, 2012. Dynamo: a flexible, user-friendly development tool for subtomogram averaging of cryo-EM data in high-performance computing environments. J. Struct. Biol. 178 (2), 139–151.
  9. Che Chengqian, Lin Ruogu, Zeng Xiangrui, Elmaaroufi Karim, Galeotti John, Xu Min, 2018. Improved deep learning-based macromolecules structure classification from electron cryo-tomograms. Mach. Vis. Appl. 29 (8), 1227–1236.
  10. Che Chengqian, Xian Zhou, Zeng Xiangrui, Gao Xin, Xu Min, 2019. Domain randomization for macromolecule structure classification and segmentation in electron cyro-tomograms. In: 2019 IEEE International Conference on Bioinformatics and Biomedicine. BIBM, IEEE, pp. 6–11.
  11. Chen Feiyang, Jiang Ying, Zeng Xiangrui, Zhang Jing, Gao Xin, Xu Min, 2020. PUB-SalNet: A pre-trained unsupervised self-aware backpropagation network for biomedical salient segmentation. Algorithms 13 (5), 126.
  12. Crowther RA, Henderson Richard, Smith John M., 1996. MRC image processing programs. J. Struct. Biol. 116 (1), 9–16.
  13. Du Xuefeng, Wang Haohan, Zhu Zhenxi, Zeng Xiangrui, Chang Yi-Wei, Zhang Jing, Xing Eric, Xu Min, 2021. Active learning to classify macromolecular structures in situ for less supervision in cryo-electron tomography. Bioinformatics 37 (16), 2340–2346.
  14. Du Xuefeng, Zeng Xiangrui, Zhou Bo, Singh Alex, Xu Min, 2019. Open-set recognition of unseen macromolecules in cellular electron cryo-tomograms by soft large margin centralized cosine loss. In: BMVC. p. 148.
  15. Fernández José-Jesús, Li Sam, 2003. An improved algorithm for anisotropic nonlinear diffusion for denoising cryo-tomograms. J. Struct. Biol. 144 (1–2), 152–161.
  16. Figueiredo Mario A.T., Jain Anil K., 2002. Unsupervised learning of finite mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 24 (3), 381–396.
  17. Förster Friedrich, Pruggnaller Sabine, Seybert Anja, Frangakis Achilleas S., 2008. Classification of cryo-electron sub-tomograms using constrained correlation. J. Struct. Biol. 161 (3), 276–286.
  18. Frank Joachim, 2006. Three-Dimensional Electron Microscopy of Macromolecular Assemblies: Visualization of Biological Molecules in Their Native State. Oxford University Press.
  19. Frazier Zachary, Xu Min, Alber Frank, 2017. Tomominer and tomominercloud: a software platform for large-scale subtomogram structural analysis. Structure 25 (6), 951–961.
  20. Gubins Ilja, Chaillet Marten L., Schot Gijs van der, Trueba M. Cristina, Veltkamp Remco C., Förster Friedrich, Wang Xiao, Kihara Daisuke, Moebel Emmanuel, Nguyen Nguyen P., White Tommi, Bunyak Filiz, Papoulias Giorgos, Gerolymatos Stavros, Zacharaki Evangelia I., Moustakas Konstantinos, Zeng Xiangrui, Liu Sinuo, Xu Min, Wang Yaoyu, Chen Cheng, Cui Xuefeng, Zhang Fa, 2021. SHREC 2021: Classification in Cryo-electron Tomograms. In: Biasotti Silvia, Dyke Roberto M., Lai Yukun, Rosin Paul L., Veltkamp Remco C. (Eds.), Eurographics Workshop on 3D Object Retrieval. The Eurographics Association.
  21. Gubins Ilja, van Der Schot Gijs, Veltkamp Remco C., Förster FG, Du Xuefeng, Zeng Xiangrui, Zhu Zhenxi, Chang Lufan, Xu Min, Moebel Emmanuel, et al., 2019. Classification in cryo-electron tomograms. SHREC’ 19 Track.
  22. Guo Qiang, Lehmer Carina, Martínez-Sánchez Antonio, Rudack Till, Beck Florian, Hartmann Hannelore, Pérez-Berlanga Manuela, Frottin Frédéric, Hipp Mark S., Hartl F. Ulrich, et al., 2018. In situ structure of neuronal C9orf72 poly-GA aggregates reveals proteasome recruitment. Cell 172 (4), 696–705.
  23. Hagen Wim J.H., Wan William, Briggs John A.G., 2017. Implementation of a cryo-electron tomography tilt-scheme optimized for high resolution subtomogram averaging. J. Struct. Biol. 197 (2), 191–198.
  24. Han Bong-Gyoon, Dong Ming, Liu Haichuan, Camp Lauren, Geller Jil, Singer Mary, Hazen Terry C., Choi Megan, Witkowska H. Ewa, Ball David A., et al., 2009. Survey of large protein complexes in D. vulgaris reveals great structural diversity. Proc. Natl. Acad. Sci. 106 (39), 16580–16585.
  25. He Kaiming, Zhang Xiangyu, Ren Shaoqing, Sun Jian, 2016. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 770–778.
  26. Hrabe Thomas, Chen Yuxiang, Pfeffer Stefan, Cuellar Luis Kuhn, Mangold Ann-Victoria, Förster Friedrich, 2012. PyTom: a Python-based toolbox for localization of macromolecules in cryo-electron tomograms and subtomogram analysis. J. Struct. Biol. 178 (2), 177–188.
  27. Kirillov Alexander, Mintun Eric, Ravi Nikhila, Mao Hanzi, Rolland Chloe, Gustafson Laura, Xiao Tete, Whitehead Spencer, Berg Alexander C., Lo Wan-Yen, et al., 2023. Segment anything. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 4015–4026.
  28. Kremer James R., Mastronarde David N., McIntosh J. Richard, 1996. Computer visualization of three-dimensional image data using IMOD. J. Struct. Biol. 116 (1), 71–76.
  29. Li Ran, Yu Liangyong, Zhou Bo, Zeng Xiangrui, Wang Zhenyu, Yang Xiaoyan, Zhang Jing, Gao Xin, Jiang Rui, Xu Min, 2020. Few-shot learning for classification of novel macromolecular structures in cryo-electron tomograms. PLoS Comput. Biol. 16 (11), e1008227.
  30. Li Ran, Zeng Xiangrui, Sigmund Stephanie E., Lin Ruogu, Zhou Bo, Liu Chang, Wang Kaiwen, Jiang Rui, Freyberg Zachary, Lv Hairong, et al., 2019. Automatic localization and identification of mitochondria in cellular electron cryo-tomography using faster-RCNN. BMC Bioinformatics 20 (3), 75–85.
  31. Lin Ruogu, Zeng Xiangrui, Kitani Kris, Xu Min, 2019. Adversarial domain adaptation for cross data source macromolecule in situ structural classification in cellular electron cryo-tomograms. Bioinformatics 35 (14), i260–i268.
  32. Liu Sinuo, Ban Xiaojuan, Zeng Xiangrui, Zhao Fengnian, Gao Yuan, Wu Wenjie, Zhang Hongpan, Chen Feiyang, Hall Thomas, Gao Xin, et al., 2020b. A unified framework for packing deformable and non-deformable subcellular structures in crowded cryo-electron tomogram simulation. BMC Bioinformatics 21 (1), 1–24.
  33. Liu Siyuan, Du Xuefeng, Xi Rong, Xu Fuya, Zeng Xiangrui, Zhou Bo, Xu Min, 2019. Semi-supervised macromolecule structural classification in cellular electron cryo-tomograms using 3D autoencoding classifier. In: BMVC. vol. 30.
  34. Liu Sinuo, Ma Yan, Ban Xiaojuan, Zeng Xiangrui, Nallapareddy Vamsi, Chaudhari Ajinkya, Xu Min, 2020a. Efficient cryo-electron tomogram simulation of macromolecular crowding with application to SARS-CoV-2. In: 2020 IEEE International Conference on Bioinformatics and Biomedicine. BIBM, IEEE, pp. 80–87.
  35. Lü Yongchun, Zeng Xiangrui, Zhao Xiaofang, Li Shirui, Li Hua, Gao Xin, Xu Min, 2019. Fine-grained alignment of cryo-electron subtomograms based on MPI parallel optimization. BMC Bioinformatics 20 (1), 1–13.
  36. Mastronarde David N., Held Susannah R., 2017. Automated tilt series alignment and tomographic reconstruction in IMOD. J. Struct. Biol. 197 (2), 102–113.
  37. Nussbaumer Henri J., 2012. Fast Fourier Transform and Convolution Algorithms. vol. 2, Springer Science & Business Media.
  38. Oikonomou Catherine M., Jensen Grant J., 2017. Cellular electron cryotomography: toward structural biology in situ. Annu. Rev. Biochem. 86.
  39. Pei Long, Xu Min, Frazier Zachary, Alber Frank, 2016. Simulating cryo electron tomograms of crowded cell cytoplasm for assessment of automated particle picking. BMC Bioinformatics 17 (1), 405.
  40. Qin Yao, Lu Huchuan, Xu Yiqun, Wang He, 2015. Saliency detection via cellular automata. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 110–119.
  41. Rigort Alexander, Günther David, Hegerl Reiner, Baum Daniel, Weber Britta, Prohaska Steffen, Medalia Ohad, Baumeister Wolfgang, Hege Hans-Christian, 2012. Automated segmentation of electron tomograms for a quantitative description of actin filament networks. J. Struct. Biol. 177 (1), 135–144.
  42. Scheres Sjors H.W., 2012. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J. Struct. Biol. 180 (3), 519–530.
  43. Scheres Sjors H.W., Melero Roberto, Valle Mikel, Carazo Jose-Maria, 2009. Averaging of electron subtomograms and random conical tilt reconstructions through likelihood optimization. Structure 17 (12), 1563–1572.
  44. Shi Jie, Zeng Xiangrui, Jiang Rui, Jiang Tao, Xu Min, 2020. A simulated annealing approach for resolution guided homogeneous cryo-electron microscopy image selection. Quant. Biology 8 (1), 51–63.
  45. Simonyan Karen, Zisserman Andrew, 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
  46. Tang Guang, Peng Liwei, Baldwin Philip R., Mann Deepinder S., Jiang Wen, Rees Ian, Ludtke Steven J., 2007. EMAN2: an extensible image processing suite for electron microscopy. J. Struct. Biol. 157 (1), 38–46.
  47. Tegunov Dimitry, Cramer Patrick, 2019. Real-time cryo-electron microscopy data preprocessing with warp. Nature Methods 16 (11), 1146–1152.
  48. Tran Du, Bourdev Lubomir D., Fergus Rob, Torresani Lorenzo, Paluri Manohar, 2014. C3D: generic features for video analysis. CoRR, Abs/1412.0767 2 (7), 8.
  49. Turk Martin, Baumeister Wolfgang, 2020. The promise and the challenges of cryo-electron tomography. FEBS Lett. 594 (20), 3243–3261.
  50. Uddin Mostofa Rafid, Howe Gregory, Zeng Xiangrui, Xu Min, 2022. Harmony: A generic unsupervised approach for disentangling semantic content from parameterized transformations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 20646–20655.
  51. Wang Yaqing, Yao Quanming, Kwok James T., Ni Lionel M., 2020. Generalizing from a few examples: A survey on few-shot learning. ACM Comput. Surv. (Csur) 53 (3), 1–34.
  52. Wu Xindi, Li Chengkun, Zeng Xiangrui, Wei Haocheng, Deng Hong-Wen, Zhang Jing, Xu Min, 2022. CryoETGAN: Cryo-electron tomography image synthesis via unpaired image translation. Front. Physiol. 13, 760404.
  53. Xu Min, Alber Frank, 2012. High precision alignment of cryo-electron subtomograms through gradient-based parallel optimization. BMC Syst. Biology 6 (1), S18.
  54. Xu Min, Alber Frank, 2013. Automated target segmentation and real space fast alignment methods for high-throughput classification and averaging of crowded cryo-electron subtomograms. Bioinformatics 29 (13), i274–i282.
  55. Xu Min, Beck Martin, Alber Frank, 2012. High-throughput subtomogram alignment and classification by Fourier space constrained fast volumetric matching. J. Struct. Biol. 178 (2), 152–164.
  56. Xu Min, Chai Xiaoqi, Muthakana Hariank, Liang Xiaodan, Yang Ge, Zeev-Ben-Mordehai Tzviya, Xing Eric P., 2017. Deep learning-based subdivision approach for large scale macromolecules structure recovery from electron cryo tomograms. Bioinformatics 33 (14), i13–i22.
  57. Xu Min, Singla Jitin, Tocheva Elitza I., Chang Yi-Wei, Stevens Raymond C., Jensen Grant J., Alber Frank, 2019. De novo structural pattern mining in cellular electron cryotomograms. Structure 27 (4), 679–691.
  58. Yang Yufeng, Ma Yixiao, Zhang Jing, Gao Xin, Xu Min, 2020. AttPNet: Attention-based deep neural network for 3D point set analysis. Sensors 20 (19), 5455.
  59. Zeng Xiangrui, Howe Gregory, Xu Min, 2021. End-to-end robust joint unsupervised image alignment and clustering. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 3854–3866.
  60. Zeng Xiangrui, Kahng Anson, Xue Liang, Mahamid Julia, Chang Yi-Wei, Xu Min, 2023. High-throughput cryo-ET structural pattern mining by unsupervised deep iterative subtomogram clustering. Proc. Natl. Acad. Sci. 120 (15), e2213149120.
  61. Zeng Xiangrui, Leung Miguel Ricardo, Zeev-Ben-Mordehai Tzviya, Xu Min, 2018. A convolutional autoencoder approach for mining features in cellular electron cryo-tomograms and weakly supervised coarse segmentation. J. Struct. Biol. 202 (2), 150–160.
  62. Zeng Xiangrui, Xu Min, 2020. Gum-Net: Unsupervised geometric matching for fast and accurate 3D subtomogram image alignment and averaging. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 4073–4084.
  63. Zhan Xueying, Liu Huan, Li Qing, Chan Antoni B., 2021. A comparative survey: Benchmarking for pool-based active learning. In: IJCAI. pp. 4679–4686.
  64. Zhan Xueying, Wang Qingzhong, Huang Kuan-hao, Xiong Haoyi, Dou Dejing, Chan Antoni B., 2022. A comparative survey of deep active learning. arXiv preprint arXiv:2203.13450.
  65. Zhao Yizhou, Bian Hengwei, Mu Michael, Uddin Mostofa R., Li Zhenyang, Li Xiang, Wang Tianyang, Xu Min, 2024. Training-free cryoet tomogram segmentation. arXiv preprint arXiv:2407.06833.
  66. Zhao Yixiu, Zeng Xiangrui, Guo Qiang, Xu Min, 2018b. An integration of fast alignment and maximum-likelihood methods for electron subtomogram averaging and classification. Bioinformatics 34 (13), i227–i236.
  67. Zhao Guannan, Zhou Bo, Wang Kaiwen, Jiang Rui, Xu Min, 2018a. Respond-cam: Analyzing deep models for 3d imaging data by visualizations. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2018: 21st International Conference, Granada, Spain, September 16–20, 2018, Proceedings, Part I. Springer, pp. 485–492.
  68. Zhou Bo, Guo Qiang, Wang Kaiwen, Zeng Xiangrui, Gao Xin, Xu Min, 2018. Feature decomposition based saliency detection in electron cryo-tomograms. In: 2018 IEEE International Conference on Bioinformatics and Biomedicine. BIBM, IEEE, pp. 2467–2473.
  69. Zhou Bo, Yu Haisu, Zeng Xiangrui, Yang Xiaoyan, Zhang Jing, Xu Min, 2020. One-shot learning with attention-guided segmentation in cryo-electron tomography. Front. Mol. Biosci. 7.
  70. Zhu Wangjiang, Liang Shuang, Wei Yichen, Sun Jian, 2014. Saliency optimization from robust background detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 2814–2821.
