iScience. 2025 Mar 18;28(5):112235. doi: 10.1016/j.isci.2025.112235

Table 2. Model network structural hyperparameters

| Parameter | Value | Explanation |
|---|---|---|
| n_cyc | 30 | Number of previous cycles used for model input. |
| batch_size | 512 | Batch size for training and validation. |
| lr | 8e-4 | Learning rate for training. |
| num_epochs | 12,000 | Maximum number of training epochs. |
| patience | 1,600 | Early-stopping patience. |
| alpha | [0.1] * 10 | Capacity-loss weights during pre-training. |
| in_ch | 4 | Number of input channels for the convolution layers. |
| out_ch | [8, 16, 64] | Numbers of output channels for the convolution layers. |
| kernel | 3 | Kernel size for the convolution layers. |
| stride | 2 | Stride for the convolution layers. |
| padding | 0 | Padding for the convolution layers. |
| embed_dim | 64 | Embedding dimension of the multi-head attention layers. |
| num_heads | 2 | Number of attention heads in the multi-head attention layers. |
| dropout | 0.3 | Dropout rate for the multi-head attention layers. |
| dense_1 | 64 | Number of neurons in the first dense layer. |
| dense_2 | 64 | Number of neurons in the second dense layer. |
| finetune_lr | 2e-5 | Learning rate for fine-tuning. |
| train_alpha | [0.09] * 9 + [0] | Fine-tuning capacity-loss weights during training. |
| valid_alpha | [0.09] * 9 + [0] | Fine-tuning capacity-loss weights during validation. |
| finetune_epochs | 800 | Maximum number of fine-tuning epochs. |
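The alpha entries use Python list-repetition syntax, so `[0.09] * 9 + [0]` expands to ten per-step weights with the last one zeroed. As a minimal sketch (not the authors' code), the settings below are collected into a plain dictionary, and the standard 1-D convolution output-length formula is used to trace how a window of n_cyc = 30 cycles shrinks through three convolution layers with kernel 3, stride 2, and no padding; the assumption that the convolutions act along the cycle dimension is ours.

```python
# Hypothetical config mirroring Table 2; the model itself is not reproduced here.
config = {
    "n_cyc": 30,                      # previous cycles per input window
    "batch_size": 512,
    "lr": 8e-4,
    "num_epochs": 12_000,
    "patience": 1_600,
    "alpha": [0.1] * 10,              # ten equal capacity-loss weights (pre-training)
    "in_ch": 4,
    "out_ch": [8, 16, 64],            # one entry per convolution layer
    "kernel": 3,
    "stride": 2,
    "padding": 0,
    "embed_dim": 64,
    "num_heads": 2,
    "dropout": 0.3,
    "dense_1": 64,
    "dense_2": 64,
    "finetune_lr": 2e-5,
    "train_alpha": [0.09] * 9 + [0],  # last-step weight zeroed for fine-tuning
    "finetune_epochs": 800,
}

def conv_out_len(length, kernel=3, stride=2, padding=0):
    """Standard output-length formula for a 1-D convolution."""
    return (length + 2 * padding - kernel) // stride + 1

# Sequence length after each of the three convolution layers, from n_cyc = 30:
lengths = [config["n_cyc"]]
for _ in config["out_ch"]:
    lengths.append(conv_out_len(lengths[-1]))
print(lengths)  # [30, 14, 6, 2]
```

Under these assumptions the 30-cycle window is reduced to a short sequence of 64-channel features, which is consistent with feeding an `embed_dim` of 64 into the attention layers.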