2022 Sep 7;60(1):157–173. doi: 10.1007/s10844-022-00745-1

Table 2.

Hyper-parameters considered for pre-trained models

Model      Parameter                      Value
GraphBERT  Hidden Layer Number            2
           Subgraph Size                  7
           Learning Rate                  5e-4
           Hidden Size                    32
           Hidden Dropout Rate            0.5
           Attention Head Number          3
           Attention Dropout Rate         0.4
           Weight Decay                   0.01
BERT       Number of Transformer Blocks   12
           Training Batch Size            32
           Learning Rate                  5e-4
           Weight Decay                   0.01
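
For readers who want to reuse these settings, the values in Table 2 could be recorded as a small configuration object. The following is a minimal sketch only: the class names `GraphBertHyperParams` and `BertHyperParams` and all field names are hypothetical, chosen for illustration, and do not correspond to the authors' code or to any library's API. Only the numeric values are taken from the table.

```python
from dataclasses import dataclass, asdict


# Hypothetical container; field names are illustrative, values come from Table 2.
@dataclass(frozen=True)
class GraphBertHyperParams:
    hidden_layer_number: int = 2
    subgraph_size: int = 7
    learning_rate: float = 5e-4
    hidden_size: int = 32
    hidden_dropout_rate: float = 0.5
    attention_head_number: int = 3
    attention_dropout_rate: float = 0.4
    weight_decay: float = 0.01


# Hypothetical container; field names are illustrative, values come from Table 2.
@dataclass(frozen=True)
class BertHyperParams:
    num_transformer_blocks: int = 12
    train_batch_size: int = 32
    learning_rate: float = 5e-4
    weight_decay: float = 0.01


graphbert_params = GraphBertHyperParams()
bert_params = BertHyperParams()

if __name__ == "__main__":
    # Print each setting so the values can be checked against the table.
    for name, params in (("GraphBERT", graphbert_params), ("BERT", bert_params)):
        for key, value in asdict(params).items():
            print(f"{name}: {key} = {value}")
```

Freezing the dataclasses keeps the recorded values from being changed by accident during an experiment; a variant run can be created with `dataclasses.replace` instead.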