Table 5. Comparison with modern motion generation methods.
| Method | Joint position error (mm) ↓ | Motion fidelity (FID) ↓ | Knowledge consistency (%) ↑ | Style authenticity (1–5) ↑ | Training data required | Inference speed (FPS) ↑ |
|---|---|---|---|---|---|---|
| MDM [37] | 58.3 ± 4.7 | 12.8 ± 1.5 | 73.8 ± 4.1 | 3.2 ± 0.6 | Large (100K+ samples) | 8.3 ± 0.5 |
| T2M-GPT [38] | 62.7 ± 5.1 | 15.4 ± 1.8 | 71.2 ± 4.5 | 2.9 ± 0.7 | Large (100K+ samples) | 12.7 ± 0.8 |
| Fg-T2M [39] | 54.9 ± 4.3 | 11.2 ± 1.3 | 76.5 ± 3.8 | 3.5 ± 0.6 | Large (80K+ samples) | 6.8 ± 0.4 |
| KG-CMGAN (ours) | **43.8 ± 3.2** | **9.6 ± 1.1** | **91.2 ± 2.5** | **4.6 ± 0.4** | Medium (20K samples) | **21.7 ± 1.1** |
Bold values indicate the best result for each metric among the compared methods; ↓ and ↑ mark metrics for which lower and higher values are better, respectively.