Table 4:
The 3-step NLP/ML application and corresponding techniques among the 79 studies included in the systematic review*
| Steps and techniques | N | % |
|---|---|---|
| Step 1: Preprocessing | 60 | 75.9 |
| Annotation | 38 | 48.1 |
| Text tokenization | 36 | 45.6 |
| Remove stop-words | 18 | 22.8 |
| Part-of-speech (POS) tagging | 16 | 20.3 |
| Normalization | 14 | 17.7 |
| Lemmatization/stemming | 12 | 15.2 |
| Step 2: Feature extraction and representations | 69 | 87.3 |
| Rule-based NLP | 37 | 46.8 |
| Affirmation/negation | 33 | 41.8 |
| Word2vec/bag-of-words (BOW) | 23 | 29.1 |
| Named entity recognition (NER) | 16 | 20.3 |
| N-gram (Term Frequency–Inverse Document Frequency [TF-IDF], Document-Term Matrix [DTM], Term-Document Matrix [TDM]) | 15 | 19.0 |
| Latent Dirichlet Allocation (LDA) for topic modeling | 5 | 6.3 |
| Latent semantic indexing (LSI) | 1 | 1.3 |
| Knowledge graph | 1 | 1.3 |
| Step 3: Data analysis (non-neural ML) | 39 | 49.4 |
| Support vector machine (SVM) | 18 | 22.8 |
| Conditional random fields (CRF) | 9 | 11.4 |
| Logistic regression classifier | 8 | 10.1 |
| Decision tree (DT) | 6 | 7.6 |
| Naïve Bayes | 6 | 7.6 |
| Random forest (RF) | 6 | 7.6 |
| K-means clustering | 3 | 3.8 |
| K-nearest neighbors (KNN) | 3 | 3.8 |
| Boosting (e.g., Light Gradient Boosting Machine [LightGBM], eXtreme Gradient Boosting [XGBoost]) | 2 | 2.5 |
| Linear regression classifier | 2 | 2.5 |
| Bagging | 1 | 1.3 |
| Step 3: Data analysis (neural ML) | 22 | 27.8 |
| Convolutional neural network (CNN) | 10 | 12.7 |
| Recurrent neural network (RNN) (e.g., Bi-LSTM, GRU, GloVe) | 10 | 12.7 |
| Artificial neural network (ANN)/feed-forward network (FFN) | 7 | 8.9 |
| Transformer (e.g., Bidirectional Encoder Representations from Transformers [BERT], BioBERT) | 3 | 3.8 |
| Autoencoder | 3 | 3.8 |
| Embeddings from Language Model (ELMo) | 1 | 1.3 |
| Others | 2 | 2.5 |
Abbreviations: Bi-LSTM, Bidirectional Long Short-Term Memory; GRU, Gated Recurrent Unit; BERT, Bidirectional Encoder Representations from Transformers.
See Supplementary Table S5 for a list of references.
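The 3-step pipeline the table tallies can be sketched end-to-end with one technique from each step: tokenization with stop-word removal (Step 1), TF-IDF feature vectors (Step 2), and a KNN classifier (Step 3). This is a minimal standard-library Python illustration, not any reviewed study's implementation; the clinical snippets, labels, and stop-word list are invented for the example.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"the", "a", "is", "of", "and", "in", "to", "on"}

def preprocess(text):
    # Step 1: lowercase, tokenize on letter runs, remove stop words
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def tfidf_vectors(docs):
    # Step 2: term frequency-inverse document frequency (TF-IDF) features
    tokenized = [preprocess(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # smoothed IDF
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (tf[t] / len(toks)) * idf[t] for t in tf})
    return vectors

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_predict(train_vecs, labels, query_vec, k=1):
    # Step 3: k-nearest neighbors by cosine similarity, majority vote
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query_vec, train_vecs[i]),
                    reverse=True)
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy training notes (hypothetical data, two classes)
docs = [
    "patient reports severe chest pain and shortness of breath",
    "no acute distress, routine follow up visit",
    "chest pain radiating to left arm",
    "annual wellness visit, patient is healthy",
]
labels = ["urgent", "routine", "urgent", "routine"]

# Vectorize training notes and a new note together so IDF is shared
vecs = tfidf_vectors(docs + ["new note: chest pain on exertion"])
train, query = vecs[:4], vecs[4]
print(knn_predict(train, labels, query))  # → urgent
```

Any row of the table could be swapped into the same skeleton: a rule-based matcher or n-gram counts in place of TF-IDF, or an SVM or CRF in place of KNN.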