2024 Oct 1;14:22797. doi: 10.1038/s41598-024-71893-3

Table 2.

Hyperparameters of our proposed ViT-GRU model with their values.

Hyperparameters                            Values
Epochs                                     35
Batch size                                 64
Image size                                 224×224×3
Learning rate                              0.0001
Weight decay                               0.0001
Optimizer                                  AdamW, Adam, SGD
Loss function                              Categorical cross-entropy
Patch size                                 8
Number of patches                          256
Projection dimension                       64
Number of parallel self-attention heads    4
Number of transformer encoder layers       8
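
For readers who want to reuse these settings, the table can be collected into a single configuration object. The following is a minimal sketch, not the authors' code: the class name `ViTGRUConfig`, its field names, and the helper `patches_per_image` are hypothetical, and the helper assumes square, non-overlapping patches, a scheme the table itself does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTGRUConfig:
    """Values copied from Table 2; class and field names are hypothetical."""
    epochs: int = 35
    batch_size: int = 64
    image_size: int = 224        # input is image_size x image_size x channels
    channels: int = 3
    learning_rate: float = 1e-4
    weight_decay: float = 1e-4
    optimizers: tuple = ("AdamW", "Adam", "SGD")  # the table lists all three
    loss: str = "categorical_crossentropy"        # label only, not tied to a library
    patch_size: int = 8
    num_patches: int = 256       # value as reported in the table
    projection_dim: int = 64
    num_heads: int = 4
    transformer_layers: int = 8


def patches_per_image(image_size: int, patch_size: int) -> int:
    """Patch count for square, non-overlapping patches (an assumed scheme)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return (image_size // patch_size) ** 2


cfg = ViTGRUConfig()
derived = patches_per_image(cfg.image_size, cfg.patch_size)
print(derived, cfg.num_patches)  # prints "784 256"
```

Under the non-overlapping assumption, a 224×224 image with patch size 8 gives 28 × 28 = 784 patches, not the 256 listed. A count of 256 would follow from, for example, a 128×128 input with patch size 8 or a 224×224 input with patch size 14. The table does not say which setting was used, so the sketch keeps the reported value unchanged.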