Author manuscript; available in PMC: 2020 May 1.
Published in final edited form as: Proc Conf Assoc Comput Linguist Meet. 2019 Jul;2019:6558–6569. doi: 10.18653/v1/p19-1656

Figure 2:

Overall architecture for MulT on modalities (L, V, A). The crossmodal transformers, which suggest latent crossmodal adaptations, are the core components of MulT for multimodal fusion.