1: Set the mentor learning rate ηt, the mentee learning rate ηs, and the number of clients N
2: Set the hyperparameters Tstart and Tend
3: for each client i (in parallel) do
4:     Initialize the mentor parameters Θt and the mentee parameters Θs
5:     repeat
|
6:         gᵗi, gi ← LocalGradients(i)
7:         Θt ← Θt − ηt · gᵗi
8:         Factorize gi by SVD: gi ≈ Ui Σi Viᵀ
9:         Client i encrypts Ui, Σi, Vi
10:        Client i uploads the encrypted Ui, Σi, Vi to the server
11:        The server decrypts Ui, Σi, Vi
12:        The server reconstructs gi ← Ui Σi Viᵀ
13:        Global gradient g ← 0
14:        for each client i (in parallel) do
15:            g ← g + gi
16:        end for
17:        Factorize g by SVD: g ≈ U Σ Vᵀ
18:        The server encrypts U, Σ, V
19:        The server distributes the encrypted U, Σ, V to the clients
20:        Clients decrypt U, Σ, V
21:        Clients reconstruct g ← U Σ Vᵀ
22:        Θs ← Θs − ηs · g / N
23:     until the local models converge
24: end for
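The SVD steps (lines 8 and 17) factorize a gradient matrix so that only the factors U, Σ, V need to be encrypted and exchanged; the receiver rebuilds the gradient by multiplying them back. A minimal NumPy sketch of one aggregation round, with encryption omitted; the rank parameter `k` and the client count of 3 are illustrative assumptions, not values from the algorithm above:

```python
import numpy as np

def svd_compress(g, k):
    # Factorize g ~= U @ diag(s) @ Vt, keeping the top-k singular values.
    U, s, Vt = np.linalg.svd(g, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def svd_reconstruct(U, s, Vt):
    # Rebuild the (approximate) gradient from its factors.
    return U @ np.diag(s) @ Vt

# One round: each client factorizes its local gradient, the server
# reconstructs and sums them (lines 12-16).
rng = np.random.default_rng(0)
client_grads = [rng.normal(size=(8, 6)) for _ in range(3)]  # N = 3 clients

g = np.zeros((8, 6))
for gi in client_grads:
    factors = svd_compress(gi, k=6)   # full rank here, so the round trip is lossless
    g += svd_reconstruct(*factors)

# Mentee update (line 22): Θs <- Θs - ηs * g / N
eta_s, N = 0.1, len(client_grads)
theta_s = np.zeros((8, 6))
theta_s -= eta_s * g / N
```

Choosing `k` smaller than `min(m, n)` trades reconstruction accuracy for a smaller encrypted payload, since only `k` columns of U, `k` singular values, and `k` rows of Vᵀ are transmitted.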
|
LocalGradients(i):
25:     Compute the mentor and mentee task losses
26:     Compute the distillation losses between the mentor and the mentee
27:     Form the total mentor loss from its task and distillation losses
28:     Form the total mentee loss from its task and distillation losses
29:     Compute the local mentor gradient gᵗi from the total mentor loss
30:     Compute the local mentee gradient gi from the total mentee loss
31:     return gᵗi, gi
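The exact loss symbols in lines 25–28 did not survive extraction, but the structure is a standard mentor–mentee (knowledge-distillation) setup: each model has its own task loss plus a distillation term coupling the two. A minimal single-example sketch, assuming cross-entropy task losses, a KL distillation term, and an illustrative weight `lam`; none of these specific choices are confirmed by the listing above:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, label):
    # Task loss for one example (line 25).
    return -np.log(softmax(logits)[label])

def kl_distill(mentor_logits, mentee_logits):
    # Distillation loss pulling the mentee toward the mentor (line 26).
    p_t, p_s = softmax(mentor_logits), softmax(mentee_logits)
    return np.sum(p_t * (np.log(p_t) - np.log(p_s)))

# Illustrative logits for one 4-class example with true label 2.
mentor_logits = np.array([0.5, 0.1, 2.0, -0.3])
mentee_logits = np.array([0.2, 0.4, 1.1, 0.0])
label, lam = 2, 0.5   # lam is an assumed distillation weight

# Total losses (lines 27-28): task loss plus weighted distillation term.
mentor_loss = cross_entropy(mentor_logits, label)
mentee_loss = cross_entropy(mentee_logits, label) \
    + lam * kl_distill(mentor_logits, mentee_logits)

# For softmax + cross-entropy the gradient w.r.t. the logits is p - onehot;
# lines 29-30 backpropagate these through each network's parameters.
onehot = np.eye(4)[label]
mentor_grad = softmax(mentor_logits) - onehot
mentee_grad = (softmax(mentee_logits) - onehot) \
    + lam * (softmax(mentee_logits) - softmax(mentor_logits))
```

Only `mentee_grad` (gi) enters the encrypted federated exchange; `mentor_grad` (gᵗi) is consumed locally by the mentor update.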
|