2024 Nov 19;45(17):e26799. doi: 10.1002/hbm.26799

FIGURE 2.

Our proposed spatio‐modality attention module for multimodal fusion. The concatenated embeddings are sent through two branches: (i) the modality branch, which learns the cross‐modality interactions and encodes them in the attention mask Tm, and (ii) the spatial branch, which captures the relevant context from each biological site in the attention mask Ts. The two masks are merged into a final attention mask Tf. The spatial branch uses dilated convolution to learn context across the multimodal tensor, while the modality branch uses a fully connected layer for modality attention.
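
The two-branch design in the caption can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several details are assumptions the caption does not state: each modality contributes one scalar per site, both branches start from average pooling, both masks pass through a sigmoid, the dilated convolution is one-dimensional with zero padding, and Tm and Ts are merged by an outer product. The function names, weights, and kernel values are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def modality_mask(x, weights, bias):
    """Modality branch (Tm): pool each modality over sites, then one fully connected layer."""
    pooled = [sum(row) / len(row) for row in x]  # one scalar per modality (assumed pooling)
    return [
        sigmoid(sum(w * p for w, p in zip(w_row, pooled)) + b)
        for w_row, b in zip(weights, bias)
    ]


def spatial_mask(x, kernel, dilation):
    """Spatial branch (Ts): pool over modalities, then a zero-padded dilated 1-D convolution."""
    n_sites = len(x[0])
    pooled = [sum(row[s] for row in x) / len(x) for s in range(n_sites)]
    half = len(kernel) // 2
    mask = []
    for s in range(n_sites):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = s + (k - half) * dilation  # kernel taps sit `dilation` sites apart
            if 0 <= idx < n_sites:  # positions outside the tensor count as zero
                acc += w * pooled[idx]
        mask.append(sigmoid(acc))
    return mask


def spatio_modality_attention(x, weights, bias, kernel, dilation=2):
    """Return the merged mask Tf and the re-weighted input."""
    t_m = modality_mask(x, weights, bias)
    t_s = spatial_mask(x, kernel, dilation)
    # Assumed merge rule: outer product, so Tf[m][s] = Tm[m] * Ts[s].
    t_f = [[m * s for s in t_s] for m in t_m]
    attended = [[v * a for v, a in zip(row, mask_row)] for row, mask_row in zip(x, t_f)]
    return t_f, attended


# Hypothetical toy input: 3 modalities, 8 sites, random weights.
random.seed(0)
n_modalities, n_sites = 3, 8
x = [[random.gauss(0, 1) for _ in range(n_sites)] for _ in range(n_modalities)]
weights = [[random.gauss(0, 1) for _ in range(n_modalities)] for _ in range(n_modalities)]
bias = [0.0] * n_modalities
kernel = [0.25, 0.5, 0.25]
t_f, attended = spatio_modality_attention(x, weights, bias, kernel)
```

With a dilation of 2 and a three-tap kernel, each entry of Ts draws on sites up to two positions away on either side, which is how dilation widens the context without adding kernel weights. Under the assumed outer-product merge, every entry of Tf lies strictly between 0 and 1 and has the same shape as the input.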