Table 2.
Scene Graph Detection |
Scene Graph Classification |
Predicate Classification |
||||||||
---|---|---|---|---|---|---|---|---|---|---|
Model | R@20 | R@50 | R@100 | R@20 | R@50 | R@100 | R@20 | R@50 | R@100 | |
Baselines | BASELINE [n = 10] | 0.00 | 0.00 | 0.00 | 0.04 | 0.04 | 0.04 | 3.17 | 5.30 | 6.61 |
FREQ | 9.01 | 11.01 | 11.64 | 11.10 | 11.08 | 10.92 | 20.98 | 20.98 | 20.80 | |
FREQ+OVERLAP | 10.16 | 10.84 | 10.86 | 9.90 | 9.91 | 9.91 | 20.39 | 20.90 | 22.21 | |
TRANSFER LEARNING | 11.99 | 14.40 | 16.48 | 17.10 | 17.91 | 18.16 | 39.69 | 41.65 | 42.37 | |
DECISION TREE [38] | 11.11 | 12.58 | 13.23 | 14.02 | 14.51 | 14.57 | 31.75 | 33.02 | 33.35 | |
LABEL PROPAGATION [57] | 6.48 | 6.74 | 6.83 | 9.67 | 9.91 | 9.97 | 24.28 | 25.17 | 25.41 | |
Ablations | OURS (DEEP) | 2.97 | 3.20 | 3.33 | 10.44 | 10.77 | 10.84 | 23.16 | 23.93 | 24.17 |
OURS (SPAT.) | 3.26 | 3.20 | 2.91 | 10.98 | 11.28 | 11.37 | 26.23 | 27.10 | 27.26 | |
OURS (CATEG.) | 7.57 | 7.92 | 8.04 | 20.83 | 21.44 | 21.57 | 43.49 | 44.93 | 45.50 | |
OURS (CATEG. + SPAT. + DEEP) | 7.33 | 7.70 | 7.79 | 17.03 | 17.35 | 17.39 | 38.90 | 39.87 | 40.02 | |
OURS (CATEG. + SPAT. + WORDVEC) | 8.43 | 9.04 | 9.27 | 20.39 | 20.90 | 21.21 | 45.15 | 46.82 | 47.32 | |
OURS (MAJORITY VOTE) | 16.86 | 18.31 | 18.57 | 18.96 | 19.57 | 19.66 | 44.18 | 45.99 | 46.63 | |
OURS (CATEG. + SPAT.) | 17.67 | 18.69 | 19.28 | 20.91 | 21.34 | 21.44 | 45.49 | 47.04 | 47.53 | |
ORACLE [nORACLE = 108n] | 24.42 | 29.67 | 30.15 | 30.15 | 30.89 | 31.09 | 69.23 | 71.40 | 72.15 |