Table 4. Motion reconstruction performance comparison.
| Method | Joint position error (mm) ↓ | Posture similarity (%) ↑ | Temporal smoothness (mm/s²) ↓ | Computational efficiency (FPS) ↑ | Semantic consistency (%) ↑ | Overall score ↑ |
|---|---|---|---|---|---|---|
| HMR [Ref] | 89.7 ± 5.3 | 71.3 ± 3.8 | 87.4 ± 6.2 | **35.8 ± 1.2** | 62.4 ± 4.7 | 68.5 ± 3.9 |
| VIBE [Ref] | 76.2 ± 4.8 | 78.5 ± 3.2 | 69.3 ± 5.1 | 28.6 ± 1.5 | 68.7 ± 3.9 | 73.9 ± 3.5 |
| Text2Action | 82.3 ± 5.7 | 75.6 ± 4.1 | 74.8 ± 5.9 | 23.7 ± 1.8 | 73.5 ± 4.2 | 72.7 ± 4.3 |
| ActionBERT | 68.9 ± 4.2 | 81.3 ± 3.6 | 63.4 ± 4.7 | 19.4 ± 1.6 | 77.8 ± 3.6 | 78.2 ± 3.5 |
| KG-VAE | 63.5 ± 3.9 | 83.7 ± 3.1 | 58.7 ± 4.3 | 18.3 ± 1.4 | 82.6 ± 3.2 | 81.5 ± 3.1 |
| KGLEAN | 61.2 ± 3.8 | 84.9 ± 2.8 | 55.3 ± 4.1 | 16.9 ± 1.3 | 84.3 ± 3.0 | 82.7 ± 2.9 |
| KG-CMGAN (Ours) | **43.8 ± 3.2** | **89.6 ± 2.3** | **41.5 ± 3.7** | 21.7 ± 1.1 | **91.2 ± 2.5** | **88.6 ± 2.4** |
Bold values indicate the best result in each column among all compared methods; ↑ means higher is better and ↓ means lower is better.