Table 6.
The best-performing audio-based, text-based, and multi-modal models. AD: Alzheimer’s Disease. Accuracy: mean and standard deviation over 5 rounds. Best: highest accuracy across all epochs in the 5 rounds.
| Input | Model (with pre-training) | Classes | Precision % | Recall % | F1 % | Accuracy % | Best % |
|---|---|---|---|---|---|---|---|
| Audio | YAMNet | AD | 64.40 ± 3.93 | 73.40 ± 8.82 | 68.60 ± 4.84 | 66.20 ± 4.79 | 83.33 |
| Text | Longformer | AD | 88.14 ± 2.09 | 74.17 ± 5.53 | 80.44 ± 3.55 | 82.08 ± 2.83 | 89.58 |
| Audio + Text | Dual BERT Concat / Joint (BERT large) | AD | 83.04 ± 3.97 | 83.33 ± 5.89 | 82.92 ± 1.86 | 82.92 ± 1.56 | 87.50 |