
Table 4. Detailed information about the different NLP models.

| Models | Tokenizer | Embedding vocabulary size | Number of parameters | Initial learning rate | Batch size |
|---|---|---|---|---|---|
| Convolutional neural network | MeCab-ko | 100,000 | 32 million | 1e-3 | 64 |
| Unidirectional long short-term memory | MeCab-ko | 100,000 | 46 million | 2e-4 | 32 |
| Bidirectional long short-term memory | MeCab-ko | 100,000 | 40 million | 2e-4 | 32 |
| Bidirectional encoder representations from transformers | WordPiece | 8002 | 92 million | 2e-5 | 8 |
| ELECTRAᵃ-version 1 | WordPiece | 32,200 | 110 million | 2e-5 | 8 |
| ELECTRA-version 2 | MeCab-ko & WordPiece | 35,000 | 112 million | 2e-5 | 8 |

ᵃELECTRA: efficiently learning an encoder that classifies token replacements accurately.
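
As a minimal sketch (not the authors' code), the hyperparameters in Table 4 could be wired into a model definition as follows for the convolutional neural network row; only the vocabulary size (100,000), initial learning rate (1e-3), and batch size (64) come from the table, while the embedding dimension, filter settings, sequence length, and output layer are assumptions for illustration.

```python
# Hypothetical CNN text classifier configured with the Table 4 hyperparameters.
# Architecture details (embedding dim, filters, max length, classes) are assumed.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 100_000   # embedding vocabulary size (Table 4)
LEARNING_RATE = 1e-3   # initial learning rate (Table 4)
BATCH_SIZE = 64        # batch size (Table 4)
EMBED_DIM = 256        # assumed embedding dimension
MAX_LEN = 128          # assumed maximum token sequence length
NUM_CLASSES = 2        # assumed number of output classes

model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, EMBED_DIM, input_length=MAX_LEN),
    layers.Conv1D(filters=128, kernel_size=5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.summary()
# Training would then use the batch size reported in the table, e.g.:
# model.fit(train_tokens, train_labels, batch_size=BATCH_SIZE, epochs=10)
```

The transformer-based rows (BERT and both ELECTRA variants) would instead be fine-tuned from pretrained checkpoints with their WordPiece tokenizers, which is why their vocabulary sizes, learning rates, and batch sizes differ from the recurrent and convolutional models.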