Table 2.
Time and resource utilization of ADN with different attention types on METR-LA.
| Model | No. of Parameters | Training Time (s/epoch) | Inference Time (ms/sample) | Peak GPU Memory Usage (GB) |
|---|---|---|---|---|
| ADN-DA | 331 K | 30 | 7.8 | 9.9 |
| ADN-GA | 331 K | 46 | 1.9 | 4.7 |
| ADN-RA | 324 K | 50 | 8.1 | 11 |
| ADN-FA | 331 K | 23 | 2.4 | 4.5 |
| ADN-EA | 331 K | 27 | 1.7 | 5.5 |
| ADN-LA | 341 K | 23 | 2.1 | 4.5 |
| ADN-FV | 330 K | 25 | 2.2 | 4.7 |