Table 8: Hyper-parameters for models and experiments.
| Category | Hyperparameter | Amazon-PQ: Large graph corpus | Amazon-PQ: Search-CTR | Amazon-PQ: ESCvsI | Amazon-PQ: Query2PT | Product-Reviews: Large graph corpus | Product-Reviews: CoPurchase | Product-Reviews: Product2PT |
|---|---|---|---|---|---|---|---|---|
| Model architecture | Number of GNN aggregator layers | 1–2 | 1 | 1 | 1 | 1–2 | 1 | 1 |
| | Dimension of GNN aggregator hidden layers | 256 | 256 | 256 | 256 | 256 | 256 | 256 |
| | Number of attention heads of RGAT | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| | Type of decoders | 1-layer MLP | 1-layer MLP | 1-layer MLP | 1-layer MLP | 1-layer MLP | 1-layer MLP | 1-layer MLP |
| Experimental setups | Learning rate of LM parameters | 1e-8 | 1e-7 | 1e-6 | 1e-7 | 1e-8 | 1e-7 | 1e-7 |
| | Learning rate of GNN parameters | 5e-4 | 5e-4 | 1e-4 | 5e-4 | 5e-4 | 5e-4 | 5e-4 |
| | Optimizer | Adam | Adam | Adam | Adam | Adam | Adam | Adam |
| | Batch size for training | 512 | 512 | 512 | 512 | 512 | 512 | 512 |
| | Batch size for evaluation | 1024 | 1024 | 1024 | 1024 | 1024 | 1024 | 1024 |
| | Max number of tokens | 256 | 256 | 256 | 256 | 256 | 256 | 256 |
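As a rough illustration of the "Experimental setups" rows, the sketch below shows how the two learning rates (one for LM parameters, one for GNN parameters) could be wired into a single Adam optimizer via parameter groups in PyTorch. The `lm` and `gnn` submodules and the placeholder layers are hypothetical stand-ins, not the paper's implementation; the values shown are the Amazon-PQ large-graph-corpus settings from Table 8.

```python
import torch

# Minimal sketch: a toy model with hypothetical `lm` and `gnn` submodules
# standing in for the language model and the RGAT aggregator.
class ToyModel(torch.nn.Module):
    def __init__(self, hidden_dim: int = 256):
        super().__init__()
        self.lm = torch.nn.Linear(hidden_dim, hidden_dim)   # placeholder for the LM
        self.gnn = torch.nn.Linear(hidden_dim, hidden_dim)  # placeholder for the GNN aggregator

model = ToyModel()

# Separate learning rates per Table 8 (Amazon-PQ, large graph corpus):
# 1e-8 for LM parameters, 5e-4 for GNN parameters, optimized with Adam.
optimizer = torch.optim.Adam([
    {"params": model.lm.parameters(), "lr": 1e-8},
    {"params": model.gnn.parameters(), "lr": 5e-4},
])
```

Using parameter groups keeps a single optimizer while letting the slowly fine-tuned LM and the freshly initialized GNN move at very different rates, which is consistent with the orders-of-magnitude gap between the two learning rates in the table.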