. 2022 Jul 2;22:181. doi: 10.1186/s12874-022-01665-y

Table 3.

Model architectures

| Model | Number of Filters/Units/Encoders | Embedding Dimension | Max Sequence Length | Dropout | Activation Function | Optimizer | Total Parameters |
|---|---|---|---|---|---|---|---|
| CNN | 8 | 200 | 557 | 0.3 | ReLU | Adam | 5.51 M |
| RNN | 8 | 200 | 557 | 0.3 | ReLU | Adam | 5.50 M |
| GRU | 8 | 200 | 557 | 0.3 | ReLU | Adam | 5.50 M |
| LSTM | 8 | 200 | 557 | 0.3 | ReLU | Adam | 5.50 M |
| Bi-LSTM | 8 | 200 | 557 | 0.3 | ReLU | Adam | 5.51 M |
| Transformer Encoder | 1 encoder (2 heads) | 200 | 557 | 0.3 | ReLU | Adam | 5.94 M |
| BERT-Base | 12 encoders (12 heads) | 768 | 512 | 0.3 (fine-tune layer) | ReLU (fine-tune layer) | Adam (fine-tune layer) | 110 M |
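
For readers who want to reuse these settings, the rows of Table 3 can be written down as a small configuration structure. The following is a minimal sketch, not the authors' code: the names `ModelConfig`, `CONFIGS`, and `embedding_params` are hypothetical, and the vocabulary size passed to `embedding_params` is an illustrative assumption, since the table does not report it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """One row of Table 3 (field names are illustrative, not from the paper)."""
    name: str
    units: str                    # filters, units, or encoders, as listed in the table
    embedding_dim: int
    max_seq_len: int
    dropout: float
    activation: str
    optimizer: str
    total_params_millions: float


# Values copied from Table 3. For BERT-Base, the dropout, activation, and
# optimizer entries apply to the fine-tune layer only.
CONFIGS = [
    ModelConfig("CNN", "8", 200, 557, 0.3, "ReLU", "Adam", 5.51),
    ModelConfig("RNN", "8", 200, 557, 0.3, "ReLU", "Adam", 5.50),
    ModelConfig("GRU", "8", 200, 557, 0.3, "ReLU", "Adam", 5.50),
    ModelConfig("LSTM", "8", 200, 557, 0.3, "ReLU", "Adam", 5.50),
    ModelConfig("Bi-LSTM", "8", 200, 557, 0.3, "ReLU", "Adam", 5.51),
    ModelConfig("Transformer Encoder", "1 encoder (2 heads)", 200, 557, 0.3, "ReLU", "Adam", 5.94),
    ModelConfig("BERT-Base", "12 encoders (12 heads)", 768, 512, 0.3, "ReLU", "Adam", 110.0),
]


def embedding_params(vocab_size: int, embedding_dim: int) -> int:
    """Number of weights in a token-embedding lookup table (no bias term)."""
    return vocab_size * embedding_dim


if __name__ == "__main__":
    for cfg in CONFIGS:
        print(f"{cfg.name}: dim={cfg.embedding_dim}, "
              f"max_len={cfg.max_seq_len}, params={cfg.total_params_millions} M")
    # The vocabulary size here is a made-up value for illustration only.
    print(embedding_params(vocab_size=10_000, embedding_dim=200))
```

Keeping the rows in one frozen structure makes it easy to check that the non-BERT models share the same embedding dimension, sequence length, and dropout, which is what makes their parameter counts comparable.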