Table 5.
Experimental results from different NLP models. The reported test values are macro-averaged classification metrics.
| Method and model | Precision (%) | Recall (%) | F1 score (%) |
| --- | --- | --- | --- |
| **Convolutional neural network (CNN)** | | | |
| CNN-random^a | 89.33 | 90.67 | 89.91 |
| CNN-fixed-Word2Vec^b | 88.01 | 93.12 | 90.43 |
| CNN-Word2Vec^c | 92.01 | 92.87 | 92.33 |
| **Long short-term memory (LSTM)** | | | |
| Unidirectional LSTM | 87.23 | 93.89 | 90.32 |
| Bidirectional LSTM | 87.97 | 92.48 | 90.09 |
| **Transformer encoder** | | | |
| Bidirectional encoder representations from transformers | 86.44 | 89.69 | 87.99 |
| ELECTRA^d-v1 | 87.73 | 92.12 | 89.82 |
| ELECTRA-v2 | 91.03 | 92.33 | 91.60 |
| **Data trimming** | | | |
| CNN-Word2Vec (trimmed^e) | 90.59 | 93.56 | 91.98 |
| Unidirectional LSTM (trimmed) | 84.77 | 93.30 | 88.61 |
| ELECTRA-v2 (trimmed) | 89.63 | 94.47 | 91.92 |
| **Ensemble combination** | | | |
| CNN-Word2Vec + unidirectional LSTM | 89.53 | 94.24 | 91.76 |
| SCENT^f-v1: CNN-Word2Vec + ELECTRA-v2 (trimmed) | 91.10 | 94.18 | 92.56 |
| Unidirectional LSTM + ELECTRA-v2 (trimmed) | 89.53 | 94.24 | 91.76 |
| CNN-Word2Vec + unidirectional LSTM + ELECTRA-v2 (trimmed) | 91.02 | 94.19 | 92.52 |
| **Hierarchical ensemble** | | | |
| CNN-Word2Vec and unidirectional LSTM + ELECTRA-v2 (trimmed) | 91.30 | 92.86 | 91.92 |
| Unidirectional LSTM and CNN-Word2Vec + ELECTRA-v2 (trimmed) | 86.83 | 93.88 | 90.09 |
| SCENT-v2: ELECTRA-v2 (trimmed) and CNN-Word2Vec + unidirectional LSTM | 89.04 | 94.44 | 91.58 |
^a Random: randomly initialized embedding.
^b Fixed-Word2Vec: nontrainable pretrained Word2Vec embedding.
^c Word2Vec: trainable pretrained Word2Vec embedding.
^d ELECTRA: efficiently learning an encoder that classifies token replacements accurately.
^e Trimmed: data sets trimmed based on the keyword "thyroid" in the comprehensive medical examination text.
^f SCENT: static and contextualized ensemble NLP network.
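For reference, macro-averaging (as used for the scores in this table) computes each metric per class and then takes the unweighted mean over classes; note that the macro F1 is therefore not, in general, the harmonic mean of the macro precision and macro recall. A minimal self-contained sketch (the function name and toy labels are illustrative, not from the paper):

```python
def macro_scores(y_true, y_pred):
    """Macro-averaged precision, recall, and F1:
    compute each metric per class, then average over classes."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_class.append((prec, rec, f1))
    n = len(labels)
    return tuple(sum(m[i] for m in per_class) / n for i in range(3))

# Toy example with two classes:
p, r, f = macro_scores(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Equivalent results can be obtained with `sklearn.metrics.precision_recall_fscore_support(..., average="macro")`.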