
Figure 2:

Architecture of the 3D attention mechanism. Given a tensor from the k-th group with b patients, p features, T time points, and c channels, the temporal transformer layer processes inputs of shape (1, 1, T, c) to learn temporal dependencies; the feature transformer layer processes inputs of shape (1, p, 1, c) to learn feature dependencies; and the patient transformer layer processes inputs of shape (b, 1, 1, c) to learn patient dependencies.
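
To make the three per-axis attention operations concrete, the following is a minimal PyTorch sketch of attention applied sequentially along the time, feature, and patient axes of a (b, p, T, c) tensor. It assumes the per-axis transformers can be emulated by folding the remaining axes into the batch dimension; the class name Axial3DAttention, the use of nn.TransformerEncoderLayer, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class Axial3DAttention(nn.Module):
    """Sketch of a 3D attention block: sequential attention along the
    temporal, feature, and patient axes of a (b, p, T, c) tensor.
    Folding the remaining axes into the batch dimension is an assumed
    design choice, not taken from the paper."""

    def __init__(self, c: int, n_heads: int = 4):
        super().__init__()
        def make_layer():
            # One standard transformer encoder layer per axis (assumed).
            return nn.TransformerEncoderLayer(
                d_model=c, nhead=n_heads, batch_first=True)
        self.temporal = make_layer()
        self.feature = make_layer()
        self.patient = make_layer()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, p, T, c = x.shape
        # Temporal attention: each (patient, feature) pair attends over T.
        x = x.reshape(b * p, T, c)
        x = self.temporal(x)
        x = x.reshape(b, p, T, c)
        # Feature attention: each (patient, time) pair attends over p.
        x = x.permute(0, 2, 1, 3).reshape(b * T, p, c)
        x = self.feature(x)
        x = x.reshape(b, T, p, c).permute(0, 2, 1, 3)
        # Patient attention: each (feature, time) pair attends over b.
        x = x.permute(1, 2, 0, 3).reshape(p * T, b, c)
        x = self.patient(x)
        x = x.reshape(p, T, b, c).permute(2, 0, 1, 3)
        return x


# Usage: 8 patients, 12 features, 24 time points, 32 channels.
x = torch.randn(8, 12, 24, 32)
y = Axial3DAttention(c=32)(x)
print(y.shape)  # torch.Size([8, 12, 24, 32])
```

In this sketch each axis is attended over while the other two axes act as independent batch elements, which matches the caption's description of the temporal, feature, and patient layers seeing effective input shapes of (1, 1, T, c), (1, p, 1, c), and (b, 1, 1, c), respectively.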