Author manuscript; available in PMC: 2024 Dec 1.
Published in final edited form as: Artif Intell Med. 2023 Nov 1;146:102701. doi: 10.1016/j.artmed.2023.102701

Table 4:

The 3-step NLP/ML application and corresponding techniques among the 79 studies included in the systematic review*

Steps and techniques N %
Step 1: Preprocessing 60 75.9
Annotation 38 48.1
Text tokenization 36 45.6
Remove stop-words 18 22.8
Part-of-speech (POS) tagging 16 20.3
Normalization 14 17.7
Lemmatization/stemming 12 15.2
Step 2: Feature extraction and representations 69 87.3
Rule-based NLP 37 46.8
Affirmation/negation 33 41.8
Word2vec/ bag-of-words (BOW) 23 29.1
Named entity recognition (NER) 16 20.3
N-gram (Term Frequency–Inverse Document Frequency [TF-IDF], Document-Term Matrix [DTM], Term-Document Matrix [TDM]) 15 19.0
Latent Dirichlet Allocation (LDA) for topic modeling 5 6.3
Latent semantic indexing (LSI) 1 1.3
Knowledge graph 1 1.3
Step 3: Data analysis (non-neural ML) 39 49.4
Support vector machine (SVM) 18 22.8
Decision tree (DT) 6 7.6
Conditional random fields (CRF) 9 11.4
Logistic regression classifier 8 10.1
Naïve Bayes 6 7.6
Random forest (RF) 6 7.6
K-means clustering 3 3.8
K-nearest neighbors (KNN) 3 3.8
Boosting (e.g., Light Gradient Boosting Machine [LightGBM], eXtreme Gradient Boosting [XGBoost]) 2 2.5
Linear regression classifier 2 2.5
Bagging 1 1.3
Step 3: Data analysis (neural ML) 22 27.8
Convolutional neural network (CNN) 10 12.7
Recurrent neural network (RNN) (e.g., Bi-LSTM, GRU, GloVe) 10 12.7
Artificial neural network (ANN)/feed-forward network (FFN) 7 8.9
Transformer (e.g., Bidirectional Encoder Representations from Transformers [BERT], BioBERT) 3 3.8
Autoencoder 3 3.8
Embeddings from Language Model (ELMo) 1 1.3
Others 2 2.5

Abbreviations: Bi-LSTM, Bidirectional Long Short-Term Memory; BERT, Bidirectional Encoder Representations from Transformers.

* See Supplementary Table S5 for a list of references.
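To make the Step 1 techniques concrete, the following is a minimal Python sketch of a preprocessing pipeline covering tokenization, stop-word removal, and a crude stand-in for stemming/lemmatization. The regex tokenizer, the tiny stop-word list, and the suffix-stripping rule are all illustrative assumptions, not the tooling used by the reviewed studies (which typically rely on libraries such as NLTK or spaCy).

```python
import re

# Illustrative stop-word subset; real pipelines use a curated list
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "was", "with"}

def preprocess(text: str) -> list[str]:
    # Tokenization: lowercase, then split on non-alphanumeric characters
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # Stop-word removal
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Naive suffix stripping as a stand-in for stemming/lemmatization
    return [re.sub(r"(ing|ed|s)$", "", t) if len(t) > 4 else t
            for t in tokens]

print(preprocess("The patient was treated with opioids and reported itching"))
# → ['patient', 'treat', 'opioid', 'report', 'itch']
```

In practice each of these sub-steps would be replaced by a library component (e.g., a Porter stemmer instead of the suffix rule), but the order of operations shown here matches the Step 1 rows of the table.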
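For Step 2, the bag-of-words and TF-IDF rows can be sketched as follows. This is a from-scratch illustration using a smoothed IDF variant (as in many common implementations); the example documents are invented, and production work would normally use a library vectorizer rather than this hand-rolled version.

```python
import math
from collections import Counter

def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Compute TF-IDF weights for tokenized documents (illustrative)."""
    n = len(docs)
    # Document frequency: in how many documents each term appears
    df = Counter(t for doc in docs for t in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Term frequency times smoothed IDF; exact formulas vary by library
        vectors.append({t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors

docs = [["pain", "opioid"], ["pain", "pain", "nausea"]]
vectors = tf_idf(docs)
```

A term appearing in every document (here "pain") gets no IDF boost, while rarer terms ("nausea", "opioid") are up-weighted relative to their raw counts, which is the behavior the table's N-gram/TF-IDF row refers to.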
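Among the non-neural classifiers in Step 3, Naïve Bayes is simple enough to sketch end to end. The following multinomial Naïve Bayes with Laplace smoothing operates on token lists such as those produced above; the training documents and labels are fabricated for illustration and do not come from the reviewed studies.

```python
import math
from collections import Counter

class NaiveBayes:
    """Multinomial Naïve Bayes with Laplace smoothing (illustrative sketch)."""

    def fit(self, docs: list[list[str]], labels: list[str]) -> "NaiveBayes":
        self.priors = Counter(labels)                      # class frequencies
        self.counts = {c: Counter() for c in self.priors}  # per-class token counts
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc)
        self.vocab = {t for doc in docs for t in doc}
        return self

    def predict(self, doc: list[str]) -> str:
        def log_posterior(c: str) -> float:
            total = sum(self.counts[c].values())
            lp = math.log(self.priors[c] / sum(self.priors.values()))
            for t in doc:
                # Laplace (+1) smoothing avoids zero probabilities
                lp += math.log((self.counts[c][t] + 1) / (total + len(self.vocab)))
            return lp
        return max(self.priors, key=log_posterior)

# Hypothetical toy data: token lists labeled by whether they flag misuse
train_docs = [["opioid", "misuse"], ["opioid", "misuse", "risk"],
              ["no", "misuse"], ["routine", "visit"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)
```

The same fit/predict interface applies to the table's other non-neural options (SVM, logistic regression, random forest), which differ only in the decision function they learn.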
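Finally, the neural rows of Step 3 all build on the feed-forward computation listed in the ANN/FFN row. Below is a forward pass for a one-hidden-layer network with ReLU and a sigmoid output, written without any framework; the weights are arbitrary untrained values chosen purely to make the arithmetic checkable, not parameters from any reviewed model.

```python
import math

def ffn_forward(x: list[float],
                W1: list[list[float]], b1: list[float],
                W2: list[float], b2: float) -> float:
    # Hidden layer: affine transform followed by ReLU activation
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    # Output layer: affine transform followed by sigmoid (binary classification)
    z = sum(w * hi for w, hi in zip(W2, h)) + b2
    return 1.0 / (1.0 + math.exp(-z))

# Arbitrary illustrative weights (identity hidden layer, summing output)
prob = ffn_forward([1.0, 0.0],
                   W1=[[1.0, 0.0], [0.0, 1.0]], b1=[0.0, 0.0],
                   W2=[1.0, 1.0], b2=0.0)
```

CNNs, RNNs, and Transformers in the table replace this dense hidden layer with convolutional, recurrent, or attention layers respectively, but the final classification head is typically this same structure.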