Table 2.
Taxonomy of data fusion methods based on multimodal DL. Early fusion strategies are subcategorized according to the applied approach and architecture. Intermediate fusion strategies are subcategorized according to the type of layers in the unimodal branches and whether a joint representation is learned. Late fusion strategies are subcategorized according to the type of aggregation and the contribution of each model.
| Fusion strategy | Taxonomy Subcategory 1 | Taxonomy Subcategory 2 | Papers |
|---|---|---|---|
| | Approach | Architecture | |
| Early fusion | Direct modeling | Fully connected | [17–19] |
| | | Convolutional | [20–23] |
| | | Recurrent | [20, 24] |
| | Autoencoder | Regular | [25–34] |
| | | Denoising | [33, 35–37] |
| | | Stacked | [37–40] |
| | | Variational | [33, 40–42] |
| | Branch | Representation | |
| Intermediate fusion | Homogeneous design | Marginal | [43–49] |
| | | Joint | [21, 28, 38, 41, 50–63] |
| | Heterogeneous designs | Marginal | [64–68] |
| | | Joint | [69–81] |
| | Aggregation | Model contribution | |
| Late fusion | Averaging | Equal | [82–84] |
| | | Weighted | [85–87] |
| | Meta-learning | Weighted | [83, 88] |
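
To make the late fusion rows concrete, the following minimal Python sketch illustrates the difference between equal and weighted averaging of unimodal model outputs. It is an illustration only and is not taken from any of the cited papers: the function name `late_fusion_average`, the two example modalities, and the chosen weights are all hypothetical, and the unimodal models are assumed to output class probabilities already.

```python
from typing import List, Optional, Sequence


def late_fusion_average(
    predictions: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Fuse per-modality class probabilities by (weighted) averaging.

    predictions[m][c] is the probability that unimodal model m assigns
    to class c. With weights=None every model contributes equally.
    (Hypothetical helper, written only to illustrate the taxonomy.)
    """
    n_models = len(predictions)
    if n_models == 0:
        raise ValueError("need at least one unimodal prediction")
    if weights is None:
        weights = [1.0] * n_models  # equal model contribution
    if len(weights) != n_models:
        raise ValueError("one weight per unimodal model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(predictions[0])
    fused = [0.0] * n_classes
    for model_probs, weight in zip(predictions, weights):
        if len(model_probs) != n_classes:
            raise ValueError("all models must predict the same classes")
        for c, prob in enumerate(model_probs):
            # Normalize weights so the fused vector stays a distribution.
            fused[c] += (weight / total) * prob
    return fused


# Assumed example: two unimodal models, two classes.
modality_a = [0.8, 0.2]
modality_b = [0.4, 0.6]

equal = late_fusion_average([modality_a, modality_b])
weighted = late_fusion_average([modality_a, modality_b], weights=[3.0, 1.0])
```

In this sketch the weights are fixed by hand. In the meta-learning row of the table, the combination of unimodal outputs would instead be learned by a further model trained on those outputs, which this example does not implement.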