Table 2.
Feature configurations | Input modality | LOOCV accuracy | Test set accuracy |
---|---|---|---|
Temporal + char4grams | audio + text | 0.8611 | 0.9167 |
New + char4grams | audio + text | 0.8889 | 0.8750 |
char4grams | text | 0.8611 | 0.8958 |
top three late fusion | / | 0.8796 | 0.9375 |
BERT—reimplementation of Yuan et al. (2020) | / | 0.8426 | 0.8333 |
ERNIE best related work (Yuan et al., 2020) | / | / | 0.8958 |
The Feature configurations column indicates which feature configuration was used and whether char4grams were added; the Input modality column shows the modality from which the ADR features were generated. The best individual results in LOOCV and on the test set, as well as the late-fusion result, are shown in bold. The row labelled top three late fusion presents the results of late (decision) fusion, i.e., majority voting, over the three best approaches.
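The late (decision) fusion described above can be illustrated with a minimal majority-voting sketch. It is not the authors' implementation: the function name `majority_vote`, the integer class labels, and the example predictions are all hypothetical and serve only to show how per-sample votes from three classifiers are combined into one decision.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def majority_vote(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Fuse per-model label predictions by majority voting.

    predictions[m][i] is the label that model m assigns to sample i.
    With an odd number of models and two classes, ties cannot occur;
    otherwise the label seen first among the tied ones is returned.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    fused = []
    # zip(*predictions) yields, for each sample, the tuple of votes across models.
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        fused.append(label)
    return fused


# Hypothetical predictions from three classifiers on four samples
# (illustrative values only, not data from the paper).
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]

fused_labels = majority_vote([model_a, model_b, model_c])
print(fused_labels)  # [1, 0, 1, 1]
```

Because voting operates only on the predicted labels, the three systems can rely on different feature sets or modalities without any change to the fusion step.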