Author manuscript; available in PMC: 2021 May 26.
Published in final edited form as: Front Comput Sci. 2021 May 12;3:624683. doi: 10.3389/fcomp.2021.624683

Table 6.

The best classification cases of the audio-based, text-based, and multi-modal models. AD: Alzheimer’s Disease. Accuracy: mean and standard deviation of results of 5 rounds. Best: highest accuracy of all epochs in 5 rounds.

| Input | Model (with pre-training) | Class | Precision % | Recall % | F1 % | Accuracy % | Best % |
|---|---|---|---|---|---|---|---|
| Audio | YAMNet | non-AD | 69.60 ± 6.80 | 59.20 ± 7.73 | 63.40 ± 5.57 | 66.20 ± 4.79 | 83.33 |
| | | AD | 64.40 ± 3.93 | 73.40 ± 8.82 | 68.60 ± 4.84 | | |
| Text | Longformer | non-AD | 77.87 ± 3.75 | 90.00 ± 2.04 | 83.44 ± 2.33 | 82.08 ± 2.83 | 89.58 |
| | | AD | 88.14 ± 2.09 | 74.17 ± 5.53 | 80.44 ± 3.55 | | |
| Audio + Text | Dual BERT Concat / Joint (BERT large) | non-AD | 83.62 ± 4.25 | 82.50 ± 5.53 | 82.80 ± 1.76 | 82.92 ± 1.56 | 87.50 |
| | | AD | 83.04 ± 3.97 | 83.33 ± 5.89 | 82.92 ± 1.86 | | |
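
The caption defines Accuracy as the mean and standard deviation over 5 rounds and Best as the highest accuracy of all epochs in those rounds. The following is a minimal sketch of how such a summary could be computed. It is not the authors' code. It assumes that each round's reported result is its final-epoch accuracy and that the spread is the sample standard deviation; neither detail is stated in the caption. The function name `summarize_rounds` and the input numbers are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def summarize_rounds(acc_by_round):
    """Summarize accuracy over several training rounds.

    acc_by_round: one list of per-epoch accuracies (in %) per round.
    Returns (mean, standard deviation, best), where mean and standard
    deviation use each round's final-epoch accuracy (an assumption) and
    best is the maximum over every epoch of every round.
    """
    final_acc = [epochs[-1] for epochs in acc_by_round]
    best_acc = max(max(epochs) for epochs in acc_by_round)
    # Sample standard deviation (n - 1 denominator); this is an assumption.
    return mean(final_acc), stdev(final_acc), best_acc


# Illustrative, made-up accuracies for three rounds of three epochs each.
example_rounds = [
    [60.0, 70.0, 66.0],
    [62.0, 68.0, 64.0],
    [58.0, 72.0, 70.0],
]
avg, spread, best = summarize_rounds(example_rounds)
print(f"Accuracy: {avg:.2f} ± {spread:.2f}, Best: {best:.2f}")
```

Under these definitions Best can exceed the mean accuracy by a wide margin, as it does for every model in the table, because it is a maximum over all epochs rather than an average over rounds.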