Table 3.
Comparison of model performance for different numbers of heads in the self-attention layer.
| Number of heads | MAE | RMSE | PCC |
|---|---|---|---|
| 1 | 5.92 ± 0.62 | 7.56 ± 0.78 | 0.44 ± 0.11 |
| 2 | 6.89 ± 0.31 | 8.73 ± 0.51 | 0.31 ± 0.16 |
| 4 | 6.92 ± 0.48 | 8.71 ± 0.61 | 0.34 ± 0.14 |
| 8 | 7.09 ± 0.58 | 8.96 ± 0.54 | 0.33 ± 0.11 |
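
For reference, the three metrics in Table 3 can be computed as in the minimal sketch below. It assumes that MAE, RMSE, and PCC denote mean absolute error, root-mean-square error, and the Pearson correlation coefficient, and that each "mean ± spread" entry aggregates repeated runs or folds using the sample standard deviation; the source does not state the aggregation protocol. The function names (`mae`, `rmse`, `pcc`, `summarize`) are hypothetical and not taken from the authors' code.

```python
import math
from statistics import mean, stdev


def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root-mean-square error between targets and predictions."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def pcc(y_true, y_pred):
    """Pearson correlation coefficient between targets and predictions."""
    mt, mp = mean(y_true), mean(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def summarize(values):
    """Format per-run scores as 'mean ± std'.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return f"{mean(values):.2f} ± {stdev(values):.2f}"
```

Under these assumptions, a table cell would be produced by calling, for example, `summarize([mae(y, y_hat) for y, y_hat in runs])`, where `runs` is a hypothetical list of per-run target and prediction pairs. Lower MAE and RMSE and higher PCC indicate better performance.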