CNN-based methods
Standard CNN (refs. 122–127) | Standard CNN structure: a convolutional layer, a pooling layer, and a fully connected layer. Some studies also incorporate other textual features (e.g., POS, LIWC, BoW).
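As an illustration, the convolution–pooling–classification pipeline of a standard text CNN can be sketched as follows. All names, dimensions, and weights here are hypothetical (random, not trained); the sketch only shows the data flow, not any cited study's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, emb_dim = 10, 8        # 10 tokens, 8-dim word embeddings
n_filters, width = 4, 3         # 4 convolutional filters of width 3

x = rng.standard_normal((seq_len, emb_dim))         # embedded post
filters = rng.standard_normal((n_filters, width, emb_dim))

# Convolutional layer: slide each filter over token windows.
conv = np.array([
    [np.sum(x[t:t + width] * f) for t in range(seq_len - width + 1)]
    for f in filters
])
conv = np.maximum(conv, 0)                          # ReLU activation

# Pooling layer: max-over-time pooling yields one value per filter.
pooled = conv.max(axis=1)                           # shape (n_filters,)

# Fully connected layer: map pooled features to class probabilities.
w_out = rng.standard_normal((2, n_filters))         # 2 classes
logits = w_out @ pooled
probs = np.exp(logits) / np.exp(logits).sum()       # softmax
```

Max-over-time pooling is what makes the representation length-invariant: each filter contributes its single strongest response regardless of where in the post it fired.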
|
Multi-Gated LeakyReLU CNN (MGL-CNN) (ref. 128) | Two hierarchical neural network models (post-level and user-level) with gated units and convolutional networks.
|
Graph model combined with convolutional neural network (ref. 129) | A unified hybrid model that combines a CNN with a factor graph model, leveraging both social interactions and content.
RNN-based methods
LSTM or GRU, some with an attention mechanism (refs. 32, 133, 136, 232–234) | Standard RNN structure: Long Short-Term Memory (LSTM) or Gated Recurrent Unit (GRU) networks; some studies add an attention mechanism.
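The attention mechanism added on top of an LSTM/GRU is typically a weighted pooling over the per-token hidden states. A minimal sketch, assuming a generic dot-product scorer against a learned context vector (the exact scoring function varies across the cited studies):

```python
import numpy as np

rng = np.random.default_rng(1)

seq_len, hidden = 6, 5
h = rng.standard_normal((seq_len, hidden))   # LSTM/GRU output per token

# Score each hidden state against a learned context vector u.
u = rng.standard_normal(hidden)
scores = h @ u                               # one score per time step
weights = np.exp(scores) / np.exp(scores).sum()   # softmax over time

# Attended representation: weighted sum of hidden states.
context = weights @ h                        # shape (hidden,)
```

The softmax weights also make the model interpretable: inspecting `weights` shows which tokens (or, at the sentence level in a HAN, which sentences) the classifier attended to.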
|
Hierarchical Attention Network (HAN) with GRU (ref. 138) | A GRU with a word-level attention layer and a sentence-level attention layer.
|
LSTM with transfer learning (refs. 140, 141) | Transfer learning is used to pre-train the model on an open dataset.
|
LSTM or GRU with multi-task learning (refs. 142, 235–237) | Multi-task learning is used to improve illness detection; the auxiliary tasks include multi-risk behavior classification, severity score prediction, word vector classification, and sentiment classification.
|
LSTM or GRU with reinforcement learning (refs. 143, 144) | Reinforcement learning is used to automatically select the important posts.
|
LSTM or GRU with multiple instance learning (refs. 145, 146) | Multiple instance learning is used to estimate the probability of post-level labels and thereby improve the prediction of user-level labels.
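The multiple-instance step amounts to aggregating post-level probabilities into one user-level probability. The numbers and the "noisy-or" rule below are purely illustrative choices, not necessarily those of the cited studies:

```python
# Hypothetical per-post risk probabilities for one user.
post_probs = [0.05, 0.10, 0.70, 0.20]

# Noisy-or aggregation: the user is positive if at least one
# post is positive, assuming posts are independent.
user_prob = 1.0
for p in post_probs:
    user_prob *= (1.0 - p)
user_prob = 1.0 - user_prob

# Max pooling is a simpler alternative aggregator: the user-level
# score is just the single most indicative post.
user_prob_max = max(post_probs)
```

Either aggregator lets the model be trained with only user-level labels while still producing post-level scores as a by-product.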
|
SISMO (ref. 139) | An ordinal hierarchical LSTM attention model.
Transformer-based methods
Self-attention models (refs. 148, 149) | These use the encoder structure of the Transformer, whose core component is the self-attention module.
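The self-attention module at the core of the Transformer encoder computes scaled dot-product attention over the whole sequence at once. A minimal single-head sketch with random (untrained) projection weights:

```python
import numpy as np

rng = np.random.default_rng(2)

seq_len, d_model = 4, 8
x = rng.standard_normal((seq_len, d_model))        # token embeddings

# Learned linear projections to queries, keys, and values.
w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) for _ in range(3))
q, k, v = x @ w_q, x @ w_k, x @ w_v

# Attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V
scores = q @ k.T / np.sqrt(d_model)
weights = np.exp(scores)
weights = weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax
out = weights @ v                                  # shape (seq_len, d_model)
```

Unlike an RNN, every token attends to every other token in one step, so long-range dependencies between posts or sentences do not have to survive a sequential hidden state.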
|
BERT-based models: BERT (refs. 150, 151), DistilBERT (ref. 152), RoBERTa (ref. 153), ALBERT (ref. 150), BioClinical BERT (ref. 31), XLNet (ref. 154), GPT-1 (ref. 155) | Various BERT-based pre-trained models.
Hybrid-based methods
LSTM + CNN (refs. 156–160) | Combines an LSTM with a CNN to extract both local features and sequence features.
|
STATENet (using a Transformer and an LSTM) (ref. 161) | A time-aware Transformer that combines emotional and historical information.
|
Sub-emotion network (refs. 164, 165, 238) | Integrates Bag-of-Sub-Emotions embeddings into an LSTM to capture emotional information.
|
Events and Personality traits for Stress Prediction (EPSP) model (ref. 239) | A joint memory network that learns the dynamics of a user's emotions and personality.
|
PHASE (ref. 166) | A time- and phase-aware model that learns historical emotional features from users.
|
Hyperbolic graph convolutional networks (ref. 167) | Combines hyperbolic graph convolutions with a Hawkes process to learn a user's historical emotional spectrum.