. 2023 May 25;35(7):101596. doi: 10.1016/j.jksuci.2023.101596

Table 2.

Experimental hyper-parameters.

Model parameters
    Loss functions                      Categorical cross-entropy loss and categorical smooth loss
    Optimizers                          Adam and SGD
    Learning rates                      0.0001 and 0.001
    Input size                          224 × 224
    Epochs                              100
    Batch size                          8
Multi-head attention model parameters
    Patch size                          (2, 2)
    Window size                         Window size // 2
    Number of heads                     8
    Number of MLP units                 256
    Embedding dimension (embed_dim)     64
    Drop rate                           0.01
Learning-rate selection regulating parameters
    Learning-rate reduction factor      0.2
    Verbose                             1
    Epsilon                             0.001
    Early-stopping callback patience    10
    Clip value                          0.2
    Patience                            10
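The settings above can be collected into a single experimental configuration. The sketch below is illustrative only: it assumes the two loss functions, two optimizers, and two learning rates were crossed into a full grid (the table lists the alternatives but does not state how they were paired), and the scheduling parameter names follow the Keras ReduceLROnPlateau / EarlyStopping conventions, which the table's "Verbose", "Epsilon", and "Patience" entries appear to correspond to.

```python
from itertools import product

# Alternatives listed in Table 2 (crossing them into a full grid
# is an assumption; the table does not state the pairing).
losses = ["categorical_crossentropy", "categorical_smooth_loss"]
optimizers = ["adam", "sgd"]
learning_rates = [0.0001, 0.001]

# Fixed training settings from Table 2.
fixed = {"input_size": (224, 224), "epochs": 100, "batch_size": 8}

# Multi-head attention model settings (key names are illustrative).
attention = {
    "patch_size": (2, 2),
    "num_heads": 8,
    "mlp_units": 256,
    "embed_dim": 64,
    "drop_rate": 0.01,
}

# Learning-rate scheduling / early-stopping settings, keyed in the
# style of Keras ReduceLROnPlateau and EarlyStopping arguments.
schedule = {
    "factor": 0.2,      # reduce LR by this factor on plateau
    "verbose": 1,
    "epsilon": 0.001,   # minimum change to count as improvement
    "patience": 10,     # epochs without improvement before reducing LR
    "clip_value": 0.2,  # gradient clipping value
}
es_patience = 10        # early-stopping callback patience

# Enumerate every (loss, optimizer, learning-rate) combination.
configs = [
    {"loss": l, "optimizer": o, "learning_rate": lr, **fixed}
    for l, o, lr in product(losses, optimizers, learning_rates)
]
print(len(configs))  # 8 combinations
```

Under this reading, eight runs cover the grid, each trained for 100 epochs at batch size 8 on 224 × 224 inputs.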