TABLE 5.
KEP results of the association rule mining (ARM) baseline (A) and of DSKG R generated using Pandaset (DSKG-P R) and NuScenes (DSKG-N R) with three algorithms, where each experiment is reported as mean ± standard deviation over five runs (B, C), followed by the results of the additional investigations: different KG structures (D, E) and integration of external knowledge (F). Evaluation metrics: MRR = Mean Reciprocal Rank, H@K = Hits@K, Accu. = KEP Accuracy, Micro/Macro F1 = Micro/Macro-averaged F1-score.
| | | | Ranking metrics | | | | KEP performance metrics | | |
|---|---|---|---|---|---|---|---|---|---|
| | | Algorithm | MRR | H@1 | H@3 | H@10 | Accu. (%) | Micro F1 | Macro F1 |
| (A) | ARM | — | — | — | — | — | 27.19 | 0.16 | 0.06 |
| (B) | DSKG-P R | TransE | 0.32 ± 0.03 | 0.16 ± 0.05 | 0.35 ± 0.04 | 0.71 ± 0.03 | 22.98 ± 4.33 | 0.26 ± 0.04 | 0.20 ± 0.02 |
| | | HolE | 0.93 ± 0.00 | 0.87 ± 0.01 | 0.98 ± 0.00 | 1.00 ± 0.00 | 88.91 ± 0.64 | 0.90 ± 0.01 | 0.87 ± 0.00 |
| | | ConvKB | 0.29 ± 0.01 | 0.11 ± 0.02 | 0.31 ± 0.02 | 0.86 ± 0.02 | 17.83 ± 1.99 | 0.22 ± 0.02 | 0.17 ± 0.02 |
| (C) | DSKG-N R | TransE | 0.42 ± 0.03 | 0.22 ± 0.03 | 0.51 ± 0.03 | 0.91 ± 0.01 | 28.08 ± 2.45 | 0.32 ± 0.03 | 0.20 ± 0.01 |
| | | HolE | 0.23 ± 0.01 | 0.11 ± 0.01 | 0.22 ± 0.01 | 0.51 ± 0.03 | 13.80 ± 0.84 | 0.16 ± 0.01 | 0.11 ± 0.01 |
| | | ConvKB | 0.49 ± 0.02 | 0.31 ± 0.04 | 0.60 ± 0.02 | 0.91 ± 0.01 | 36.35 ± 2.96 | 0.40 ± 0.03 | 0.20 ± 0.01 |
| (D) | DSKG Bi | TransE | 0.41 | 0.19 | 0.52 | 0.97 | 29.03 | 0.34 | 0.32 |
| | | HolE | 0.29 | 0.11 | 0.28 | 0.87 | 16.55 | 0.19 | 0.20 |
| | | ConvKB | 0.23 | 0.07 | 0.21 | 0.68 | 12.30 | 0.16 | 0.14 |
| (E) | DSKG Prot | TransE | 0.26 | 0.10 | 0.28 | 0.62 | 17.77 | 0.21 | 0.18 |
| | | HolE | 0.33 | 0.17 | 0.32 | 0.81 | 23.70 | 0.27 | 0.22 |
| | | ConvKB | 0.30 | 0.10 | 0.36 | 0.86 | 19.21 | 0.24 | 0.20 |
| (F) | DSKG SE | TransE | 0.30 | 0.18 | 0.32 | 0.50 | 24.53 | 0.27 | 0.17 |
| | | HolE | 0.81 | 0.69 | 0.92 | 0.98 | 74.52 | 0.82 | 0.81 |
| | | ConvKB | 0.29 | 0.13 | 0.32 | 0.71 | 21.01 | 0.26 | 0.22 |
Bold values in (B, C) indicate the peak performance for each metric in DSKG-R, while underlined values in (D, E, and F) indicate the same for each additional investigation.
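The ranking metrics in the table follow their standard definitions, which can be illustrated with a minimal sketch. The function names and the example ranks below are hypothetical and are not taken from the experiments; the sketch also assumes that each test query yields the 1-based rank of the correct entity, and it does not reflect any filtering or tie-breaking protocol the evaluation may have used.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test queries (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of test queries whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct entity for four test queries.
example_ranks = [1, 2, 4, 10]

mrr = mean_reciprocal_rank(example_ranks)
h1 = hits_at_k(example_ranks, 1)
h3 = hits_at_k(example_ranks, 3)
h10 = hits_at_k(example_ranks, 10)

print(f"MRR={mrr:.4f}  H@1={h1:.2f}  H@3={h3:.2f}  H@10={h10:.2f}")
```

Under these definitions, H@1 counts only queries where the correct entity is ranked first, which is why H@1 is consistently the lowest and H@10 the highest of the three Hits@K columns in every row.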