Table 1.
Conceptual comparison of the proposed method and three representative state-of-the-art (SOTA) approaches.
| Criterion | Proposed (Ours) | FedEntropy | EDS-FL | SER |
|---|---|---|---|---|
| Design paradigm | Regularised representation optimisation | Entropy-constrained quantisation | Task-oriented distillation | Stochastic latent encoding |
| Entropy reduction potential | High (explicit regularisation of representation entropy) | Moderate (adaptive to local entropy) | Moderate (guided reduction via distillation) | Low (uncontrolled stochastic variability) |
| Semantic fidelity | High (controlled latent structure) | Moderate (depends on encoder configuration) | Moderate (teacher-guided relevance) | Low (sample-level uncertainty dominates) |
| Task adaptivity | High (task-aware training objectives) | Moderate (manual hyperparameter tuning) | Moderate (aligned via teacher) | Low (static latent sampling) |
| Generalisation capability | High (validated for unseen classes/tasks) | Low (encoder-specific generalisation) | Moderate (task-bound distillation) | Low (limited transferability) |
| Compression adaptability | High (context-aware encoding) | Moderate (entropy-profile driven) | Moderate (fixed distillation targets) | High (sampling enables variability but offers less control) |
| Latent space consistency | High (stabilised via regularisation) | Moderate | Low (inter-client feature divergence) | Low (sampling noise) |
| Interpretability | Moderate–High (structured latent space) | Low | Moderate | Low |
| Scalability | High (50+ clients, multiple tasks supported) | Moderate (tested on small-scale setups) | Low (distillation overhead) | Moderate (client-specific tuning needed) |
| Communication efficiency | High (learned, compact, and robust encoding) | Moderate (entropy heuristics) | Low (additional teacher–client exchange) | Moderate (bit-level control, low robustness) |
| Training complexity | Moderate (no external modules) | Moderate | High (teacher synchronisation required) | Moderate–High (training instability) |
| Robustness to data heterogeneity | High (validated under non-IID distributions) | Moderate (entropy adapts partially) | Low (sensitive to class imbalance) | Low (amplified by stochasticity) |
| Inference suitability for edge devices | High (low latency and small model size) | Moderate | Low (teacher model size dominates) | Moderate (incurs sampling overhead) |
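To make the "explicit regularisation of representation entropy" entry concrete, the following NumPy sketch shows one generic way such a penalty can be attached to a training objective: the latent vectors are softmax-normalised, their mean Shannon entropy is computed, and that term is added to the task loss with a weight `lam`. The function names, the softmax normalisation, and the weight are illustrative assumptions for exposition only, not the actual objective of the proposed method or of the compared approaches.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax over the latent dimension.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def representation_entropy(latents):
    """Mean Shannon entropy (in nats) of softmax-normalised latent vectors.

    latents: array of shape (batch, dim). A small epsilon guards log(0).
    """
    p = softmax(latents)
    return float(-(p * np.log(p + 1e-12)).sum(axis=-1).mean())

def regularised_loss(task_loss, latents, lam=0.1):
    """Hypothetical combined objective: task loss plus a weighted entropy
    penalty that pushes representations toward lower-entropy (peaked) codes."""
    return task_loss + lam * representation_entropy(latents)
```

Under this sketch, a batch of uniform (all-zero) latents attains the maximum entropy log(dim), while peaked latents score lower, so minimising the combined loss drives representations toward more compressible, lower-entropy structure.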