Author manuscript; available in PMC: 2024 May 21.
Published in final edited form as: Adv Neural Inf Process Syst. 2021 Dec;2021(DB1):1–20.

Table 18:

Complexity results for datasets in the robotics domain. U: unimodal models, M: multimodal fusion paradigms, O: optimization objectives, T: training structures.

Dataset: MuJoCo Push

| Category | Method | Epochs trained | Training time (s) | Training params (M) | Training peak memory (MB) | Inference time (s) | Inference params (M) |
|---|---|---|---|---|---|---|---|
| U | Unimodal (i) | 20 | 738±133 | 3.88 | 3607±1 | 3.46±0.02 | 3.88 |
| U | Unimodal (f) | 20 | 288±39 | 3.33 | 3595±2 | 0.91±0.08 | 3.33 |
| U | Unimodal (p) | 20 | 252±6 | 3.33 | 3594±1 | 0.87±0.04 | 3.33 |
| U | Unimodal (c) | 20 | 372±64 | 3.33 | 3594±1 | 0.86±0.04 | 3.33 |
| M | EF | 20 | 815±34 | 3.92 | 3654±1 | 4.44±0.55 | 3.92 |
| M | LF-LSTM | 20 | 856±46 | 1.90 | 3636±1 | 4.32±0.45 | 1.90 |
| M | TF-LSTM [179] | 20 | 1914±31 | 23.5 | 4530±9 | 7.75±0.12 | 23.5 |
| M | MulT [156] | 20 | 4792±62 | 14.6 | 6530±16 | 22.4±0.28 | 14.6 |

Dataset: Vision&Touch

| Category | Method | Epochs trained | Training time (s) | Training params (M) | Training peak memory (MB) | Inference time (s) | Inference params (M) |
|---|---|---|---|---|---|---|---|
| U | Unimodal (i) | 15 | 2633 | 1.00 | 5530 | 63.9 | 1.00 |
| U | Unimodal (f) | 15 | 2185 | 0.13 | 2426 | 51.6 | 0.13 |
| U | Unimodal (p) | 15 | 2514 | 0.08 | 2389 | 59.5 | 0.08 |
| M | LF | 15 | 2672 | 1.20 | 5572 | 64.4 | 1.20 |
| M | Sensor Fusion [91] | 50 | 11604 | 1.10 | 4467 | 62.6 | 1.10 |
| M | LRTF [106] | 35 | 8366 | 1.09 | 4987 | 64.4 | 1.09 |
| O | RefNet [135] | 15 | 3819 | 135 | 6067 | 65.0 | 1.20 |