Table 1:

F1-scores for DepAudioNet on the DAIC-WOZ dataset for three input features, with and without adversarial speaker disentanglement. F1-Avg is the unweighted mean of the F1-scores for the non-depressed (F1-ND) and depressed (F1-D) classes. +Adv indicates whether adversarial training was used. The λ value used for disentanglement is given in parentheses.

Input Feature	+Adv	F1-Avg	F1-ND	F1-D
Mel-spec	No	0.619	0.706	0.533
Mel-spec	Yes (λ = 5e-6)	0.646	0.732	0.560
Raw-audio	No	0.646	0.779	0.512
Raw-audio	Yes (λ = 1e-6)	0.660	0.726	0.594
Wav2vec2.0	No	0.686	0.804	0.567
Wav2vec2.0	Yes (λ = 1e-3)	0.692	0.808	0.576
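
The caption's definition of F1-Avg can be checked directly against the rows of Table 1. The following is a minimal sketch, assuming F1-Avg is the plain arithmetic mean of F1-ND and F1-D, reported to three decimals. The names `ROWS`, `f1_avg`, `max_deviation`, and `adversarial_gain` are illustrative and do not come from the paper.

```python
# Rows copied from Table 1: (input feature, adversarial?, F1-Avg, F1-ND, F1-D)
ROWS = [
    ("Mel-spec",   False, 0.619, 0.706, 0.533),
    ("Mel-spec",   True,  0.646, 0.732, 0.560),
    ("Raw-audio",  False, 0.646, 0.779, 0.512),
    ("Raw-audio",  True,  0.660, 0.726, 0.594),
    ("Wav2vec2.0", False, 0.686, 0.804, 0.567),
    ("Wav2vec2.0", True,  0.692, 0.808, 0.576),
]


def f1_avg(f1_nd, f1_d):
    """Unweighted mean of the two per-class F1-scores (assumed definition)."""
    return (f1_nd + f1_d) / 2


def max_deviation(rows):
    """Largest gap between a reported F1-Avg and the recomputed mean."""
    return max(abs(f1_avg(nd, d) - avg) for _, _, avg, nd, d in rows)


def adversarial_gain(rows):
    """Map each feature to the change in F1-Avg when +Adv is enabled."""
    baseline = {feat: avg for feat, adv, avg, _, _ in rows if not adv}
    return {
        feat: round(avg - baseline[feat], 3)
        for feat, adv, avg, _, _ in rows
        if adv
    }


if __name__ == "__main__":
    print("max deviation from reported F1-Avg:", max_deviation(ROWS))
    for feat, gain in adversarial_gain(ROWS).items():
        print(f"{feat}: F1-Avg change with +Adv = {gain:+.3f}")
```

Under this assumption, every reported F1-Avg matches the recomputed mean to within rounding (a gap below 0.001), and the F1-Avg changes with adversarial training are +0.027 for Mel-spec, +0.014 for Raw-audio, and +0.006 for Wav2vec2.0.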