. 2023 Jan 25;25:e34474. doi: 10.2196/34474

Table 7.

Performance comparisons of a benchmark model [61] and a proposed convolutional neural network (CNN) model for depression detection and Patient Health Questionnaire-9 score prediction on audio data sets in a text-dependent setting (read-dependent speech mode) and a text-independent setting (spontaneous mode; N=318).

| Models | Text-dependent setting (mean of 10-fold) | | | | Text-independent setting (single fold) | | | |
|---|---|---|---|---|---|---|---|---|
| | ACC^a (%) | F1-score^b (%) | CCC^c | RMSE^d | ACC (%) | F1-score (%) | CCC | RMSE |
| Proposed CNNs model | 78.14 | 77.27 | 0.28 | 9.21 | 56.82 | 37.84 | 0.287 | 5.53 |
| GCNN-LSTM^e [61] | 51.65 | 50.90 | 0.43 | 8.10 | 58.57 | 39.78 | 0.497 | 5.70 |

^a ACC: accuracy.

^b F1-score: the harmonic mean of precision and recall.

^c CCC: Concordance Correlation Coefficient.

^d RMSE: Root Mean Square Error.

^e GCNN-LSTM: Gated Convolutional Neural Network-Long Short Term Memory.
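
For readers who want to reproduce the four metrics named in the footnotes, the following is a minimal sketch in plain Python. It is not the authors' evaluation code. The function names and the toy labels and scores are hypothetical, and the F1-score here is the binary (positive-class) variant; the table does not state whether a binary or class-weighted F1 was reported. The CCC uses population (divide-by-n) variances, which is one common convention.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted scores."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def ccc(y_true, y_pred):
    """Concordance correlation coefficient (population variances).

    Assumes the two series are not both constant and equal,
    otherwise the denominator is zero.
    """
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


# Hypothetical toy data, not taken from the study.
labels_true = [1, 0, 1, 1, 0, 0]
labels_pred = [1, 0, 0, 1, 1, 0]
scores_true = [0, 5, 10, 15, 20]
scores_pred = [2, 5, 9, 14, 22]

print(f"ACC  = {100 * accuracy(labels_true, labels_pred):.2f}%")
print(f"F1   = {100 * f1_binary(labels_true, labels_pred):.2f}%")
print(f"CCC  = {ccc(scores_true, scores_pred):.3f}")
print(f"RMSE = {rmse(scores_true, scores_pred):.3f}")
```

Unlike Pearson correlation, the CCC penalizes a shift in mean or scale between predicted and true scores, so a model can track the ordering of scores well and still receive a low CCC.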