2025 Aug 6;25(15):4845. doi: 10.3390/s25154845
Algorithm 3 Transformer-based IoT attack
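Require: Dataset $D=\{X,y\}$, learning rate $\eta_0$, batch size $B$, epochs $E$, attention heads $h$, decay factor $\lambda$, folds $K$
Ensure: Trained model $M$, evaluation metrics $R$
 1: Normalize features: $x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}},\ \forall x \in X$
 2: Initialize weights $W_Q, W_K, W_V, W_O \sim \mathcal{N}(0, \sigma^2)$
 3: Initialize $R \leftarrow \emptyset$
 4: $F \leftarrow$ Stratified $K$-fold split on $(X, y)$
 5: for all $(X_{\text{train}}, y_{\text{train}}, X_{\text{test}}, y_{\text{test}}) \in F$ do
 6:     Apply SMOTE: $(X_{\text{train}}^{\text{res}}, y_{\text{train}}^{\text{res}}) \leftarrow \text{SMOTE}(X_{\text{train}}, y_{\text{train}})$
 7:     $\eta \leftarrow \eta_0$
 8:     for $e = 1$ to $E$ do
 9:         Shuffle$(X_{\text{train}}^{\text{res}}, y_{\text{train}}^{\text{res}})$
 10:         for all $B_i = (X_B, y_B) \in (X_{\text{train}}^{\text{res}}, y_{\text{train}}^{\text{res}})$ do
 11:             $Q = W_Q X_B$, $K = W_K X_B$, $V = W_V X_B$
 12:             $\alpha = \frac{Q K^{\top}}{\sqrt{d_k}}$, $A = \text{softmax}(\alpha)\, V$
 13:             $H = \text{Concat}(A_1, \ldots, A_h)\, W_O$
 14:             $\hat{H} = \text{LayerNorm}(H + X_B)$
 15:             $Z = W_1 \hat{H} + b_1$, $\tilde{Z} = \text{ReLU}(Z)$
 16:             $F = W_2 \tilde{Z} + b_2$, $\hat{F} = \text{LayerNorm}(F + \hat{H})$
 17:             $\hat{y} = \text{Softmax}(\hat{F})$
 18:             $L = -\sum_{i=1}^{N} y_i \log(\hat{y}_i)$
 19:             Update weights: $W \leftarrow W - \eta \frac{\partial L}{\partial W}$, $b \leftarrow b - \eta \frac{\partial L}{\partial b}$
 20:         end for
 21:         $\eta \leftarrow \eta_0 \times \lambda^{e/E}$
 22:         $R_e \leftarrow \text{Evaluate}(M, X_{\text{test}}, y_{\text{test}})$
 23:         if EarlyStopping$(R_e)$ then
 24:             Break
 25:         end if
 26:     end for
 27:     Append $R_e$ to $R$
 28: end for
 29: return $M, R$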
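
The numerical steps of the algorithm that do not depend on a deep learning framework (feature normalization in step 1, single-head scaled dot-product attention in step 12, the cross-entropy loss in step 18, and the learning-rate decay in step 21) can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation. The function names are hypothetical, the attention helper assumes Q, K, and V are already projected and stored row-wise (one row per sample), a constant feature column is assumed to map to 0.0, and the decay exponent is read as e/E.

```python
import math

def min_max_normalize(column):
    """Step 1: scale one feature column to [0, 1]."""
    lo, hi = min(column), max(column)
    span = hi - lo
    # Assumption: a constant column maps to 0.0 to avoid division by zero.
    return [(v - lo) / span if span else 0.0 for v in column]

def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [v / total for v in exps]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def attention(q, k, v):
    """Step 12: softmax(Q K^T / sqrt(d_k)) V for a single head."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v)

def cross_entropy(y_true, y_prob):
    """Step 18: L = -sum_i y_i log(y_hat_i) for a one-hot y_true."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_prob) if t)

def decayed_lr(eta0, lam, epoch, total_epochs):
    """Step 21, read as eta = eta0 * lam ** (epoch / total_epochs)."""
    return eta0 * lam ** (epoch / total_epochs)
```

With a decay factor below 1, `decayed_lr` falls smoothly from `eta0` at epoch 0 to `eta0 * lam` at the final epoch. Stratified splitting, SMOTE, layer normalization, and the gradient update are omitted here because in practice they would come from an existing library rather than be written by hand.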