Efficient and Dynamically Consistent Joint Torque Estimation for Wearable Neurotechnology via Knowledge Distillation

2026 Apr 17;13(4):474. doi: 10.3390/bioengineering13040474

Algorithm A1 PDC-KD Training Procedure

1: Input: Dataset $D$, teacher model $T$, student model $S$
2: Initialize student parameters $θ_{S}$; set Cholesky factor $L = I$
3: Load precomputed teacher Fisher matrix $F_{T}$ and projection operator $P_{T}$
4: for epoch $= 1$ to 200 do
5:     if epoch $< 10$ then
6:         Apply linear warm-up
7:     end if
8:     for batch $(x, y) \in D$ do
9:         Compute teacher output $τ_{T} = T(x_{cwt})$
10:         Compute student output $τ_{S} = S(x_{raw})$
11:         Apply SG filtering to obtain $(\dot{ω}_{filt}, ω_{filt})$
12:         Compute equivalent inertia tensor $I_{eff} = L L^{⊤}$
13:         Compute $L_{data}, L_{KD}, L_{Fisher}, L_{subspace}, L_{phy}$
14:         Aggregate total loss $L_{total}$ (Equation (10))
15:         Perform backpropagation with gradient clipping to update $θ_{S}$ and $L$
16:     end for
17: end for
18: Output: Trained student model $S$
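Three steps of Algorithm A1 can be illustrated in isolation: the linear warm-up (steps 5 to 7), the inertia tensor built from the Cholesky factor (step 12), and gradient clipping (step 15). The following is a minimal sketch in plain Python, not the authors' implementation. The function names, the `warmup_epochs` and `max_norm` parameters, the specific warm-up formula, and the choice of global-norm clipping are all assumptions made for illustration; the algorithm itself only states that a linear warm-up and gradient clipping are applied.

```python
import math


def warmup_scale(epoch, warmup_epochs=10):
    """Assumed linear warm-up factor: ramps up while epoch < warmup_epochs, then 1.0.

    What the factor multiplies (e.g. the learning rate) is not specified
    in Algorithm A1 and is left to the caller.
    """
    if epoch < warmup_epochs:
        return epoch / warmup_epochs
    return 1.0


def effective_inertia(L):
    """Return I_eff = L L^T for a square factor L given as a list of rows.

    Building I_eff this way keeps it symmetric and positive semidefinite
    for any real-valued L, which is why the factor is learned instead of
    the tensor itself.
    """
    n = len(L)
    return [
        [sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradients so its Euclidean norm is at most max_norm.

    Global-norm clipping is an assumed variant; the source does not say
    which clipping rule is used.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


if __name__ == "__main__":
    # Initialization from step 2: L = I gives I_eff = I.
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(effective_inertia(identity))
    print([warmup_scale(e) for e in (1, 5, 10)])
    print(clip_by_global_norm([3.0, 4.0], max_norm=1.0))
```

With $L = I$ at initialization, $I_{eff}$ starts as the identity, and any later update to $L$ still yields a symmetric positive semidefinite tensor without an explicit constraint.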