Table 3.
The deep learning methods for mental illness detection.
Type | Method | Description |
---|---|---|
CNN-based methods | Standard CNN122–127 | Standard CNN structure: convolutional layer, pooling layer, and fully connected layer. Some studies also incorporate other textual features (e.g., POS, LIWC, BoW). |
 | Multi-Gated LeakyReLU CNN (MGL-CNN)128 | Two hierarchical (post-level and user-level) neural network models with gated units and convolutional networks. |
 | Graph model combined with Convolutional Neural Network129 | A unified hybrid model combining a CNN with a factor graph model, leveraging both social interactions and content. |
RNN-based methods | LSTM or GRU (some with attention mechanism)32,133,136,232–234 | Standard RNN structure: Long Short-Term Memory (LSTM) or Gated Recurrent Unit (GRU) networks; some studies add an attention mechanism. |
 | Hierarchical Attention Network (HAN) with GRU138 | A GRU with a word-level attention layer and a sentence-level attention layer. |
 | LSTM with transfer learning140,141 | Pre-training the model on open datasets via transfer learning. |
 | LSTM or GRU with multi-task learning142,235–237 | Using multi-task learning to improve detection performance; auxiliary tasks include multi-risky-behavior classification, severity score prediction, word vector classification, and sentiment classification. |
 | LSTM or GRU with reinforcement learning143,144 | Using reinforcement learning to automatically select the most informative posts. |
 | LSTM or GRU with multiple instance learning145,146 | Using multiple instance learning to estimate post-level label probabilities and thereby improve user-level predictions. |
 | SISMO139 | An ordinal hierarchical LSTM attention model. |
Transformer-based methods | Self-attention models148,149 | Using the encoder of the transformer, which is built on self-attention modules. |
 | BERT-based models (BERT150,151, DistilBERT152, RoBERTa153, ALBERT150, BioClinical BERT31, XLNET154, GPT-1155) | Different BERT-based pre-trained models. |
Hybrid-based methods | LSTM+CNN156–160 | Combining LSTM and CNN to extract both local features and sequence features. |
 | STATENet (using transformer and LSTM)161 | A time-aware transformer combining emotional and historical information. |
 | Sub-emotion network164,165,238 | Integrating Bag-of-Sub-Emotion embeddings into an LSTM to capture emotional information. |
 | Events and Personality traits for Stress Prediction (EPSP) model239 | A joint memory network that learns the dynamics of a user's emotions and personality. |
 | PHASE166 | A time- and phase-aware model that learns historical emotional features from users. |
 | Hyperbolic graph convolutional networks167 | Hyperbolic graph convolutions with the Hawkes process to learn a user's historical emotional spectrum. |
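To make the "standard CNN structure" of the first row concrete, the following is a minimal numpy sketch of the forward pass over one post: word embeddings, a convolutional layer, max-over-time pooling, and a fully connected softmax layer. All dimensions and weights are hypothetical stand-ins for illustration; real systems train these layers with a framework such as PyTorch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a 10-token post with 50-dim word embeddings.
seq_len, emb_dim, n_filters, win, n_classes = 10, 50, 8, 3, 2
embeddings = rng.standard_normal((seq_len, emb_dim))

# Convolutional layer: each filter slides over windows of `win` words.
filters = rng.standard_normal((n_filters, win, emb_dim)) * 0.1
conv = np.stack([
    np.array([np.sum(embeddings[i:i + win] * f)
              for i in range(seq_len - win + 1)])
    for f in filters
])                               # shape: (n_filters, seq_len - win + 1)
conv = np.maximum(conv, 0.0)     # ReLU nonlinearity

# Pooling layer: max-over-time keeps the strongest response per filter.
pooled = conv.max(axis=1)        # shape: (n_filters,)

# Fully connected layer + softmax over the output classes.
w = rng.standard_normal((n_classes, n_filters)) * 0.1
b = np.zeros(n_classes)
logits = w @ pooled + b
probs = np.exp(logits - logits.max())
probs /= probs.sum()             # class probabilities, sum to 1
```

The extra textual features mentioned in the table (POS, LIWC, BoW) are typically concatenated with `pooled` before the fully connected layer.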
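The attention mechanism that several RNN rows mention (and that HAN applies at both word and sentence level) can be sketched as an additive scoring function over hidden states, with the weights used to form a single weighted-sum representation. The hidden states and parameters below are random placeholders, not outputs of an actual LSTM/GRU.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical RNN output: T = 6 time steps, d = 16 hidden units.
T, d = 6, 16
H = rng.standard_normal((T, d))   # stand-in for LSTM/GRU hidden states

# Additive attention: score each step, normalize, take the weighted sum.
W = rng.standard_normal((d, d)) * 0.1
u = rng.standard_normal(d)
scores = np.tanh(H @ W) @ u       # one relevance score per time step
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()              # attention weights, sum to 1
context = alpha @ H               # (d,) attended sequence representation
```

In a hierarchical model, a word-level version of this produces sentence vectors, and a sentence-level version then combines those into a document vector.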
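The multiple-instance-learning row aggregates post-level probabilities into a user-level prediction. Two common aggregators are sketched below on made-up post scores; the specific scores and the choice of aggregator are illustrative assumptions, not taken from the cited work.

```python
import numpy as np

# Hypothetical post-level risk probabilities for one user's four posts.
post_probs = np.array([0.05, 0.10, 0.85, 0.20])

# Max aggregation: flag the user if any single post is high-risk.
user_prob_max = post_probs.max()

# Noisy-OR aggregation: flag the user unless every post looks benign.
user_prob_or = 1.0 - np.prod(1.0 - post_probs)
```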