Author manuscript; available in PMC: 2026 Apr 9.
Published in final edited form as: Knowl Based Syst. 2025 Nov 12;331:114810. doi: 10.1016/j.knosys.2025.114810

Table 1.

Summary of the proposed framework.

| Dimension | Component | Function / Output |
|---|---|---|
| Input | 3D input with XY, XZ and YZ projections | Orthogonal cryo-ET views providing complementary cues |
| Encoder | Transformer encoder with multi-view tokens | Captures cross-view semantic consistency |
| Fusion | Graph-based aggregation module | Models spatial and frequency-level relationships |
| Decoder | Multi-scale convolutional layers | Produces voxel-wise segmentation output |
| Learning Objective | View-masked SSL + CE loss | Jointly optimizes reconstruction and segmentation |
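
The Input and Learning Objective rows of Table 1 can be illustrated with a minimal NumPy sketch. The table does not state how the projections are computed or how the two loss terms are combined, so everything below is an assumption for illustration only: mean-intensity projection along each axis, a mean-squared-error reconstruction term for the masked view, and a hypothetical `weight` parameter balancing the cross-entropy term. The function names are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def orthogonal_projections(volume):
    """Return XY, XZ and YZ views of a volume with axes (Z, Y, X).

    Assumption: each view is a mean-intensity projection along the
    remaining axis; the source table does not specify the operator.
    """
    if volume.ndim != 3:
        raise ValueError("expected a 3D volume with axes (Z, Y, X)")
    return {
        "XY": volume.mean(axis=0),  # collapse Z, result has shape (Y, X)
        "XZ": volume.mean(axis=1),  # collapse Y, result has shape (Z, X)
        "YZ": volume.mean(axis=2),  # collapse X, result has shape (Z, Y)
    }


def cross_entropy(logits, labels):
    """Mean cross-entropy over N voxels.

    logits: float array of shape (N, C); labels: int array of shape (N,).
    """
    # Subtract the row maximum for a numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = log_probs[np.arange(labels.shape[0]), labels]
    return float(-picked.mean())


def joint_loss(masked_view, reconstruction, logits, labels, weight=1.0):
    """Reconstruction (MSE, assumed) plus weighted segmentation CE.

    `weight` is a hypothetical balancing factor, not a value from the paper.
    """
    rec = float(np.mean((reconstruction - masked_view) ** 2))
    return rec + weight * cross_entropy(logits, labels)


if __name__ == "__main__":
    vol = np.random.default_rng(0).random((8, 16, 32))  # toy (Z, Y, X) volume
    views = orthogonal_projections(vol)
    for name, view in views.items():
        print(name, view.shape)
```

In this sketch one view would be held out as `masked_view`, a model would predict `reconstruction` from the other two views, and the same model's voxel-wise class scores would be flattened to `logits` of shape (N, C) before the two terms are summed.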