2023 Jun 3:1–22. Online ahead of print. doi: 10.1007/s00500-023-08507-z

Table 3.

The Hybrid Pre-processing Techniques

Hybrid pre-processing techniques (listed once for all datasets)

Tokenization:
1. Transformer-based Tokenization.
2. Byte Pair Encoding (BPE). Contextual Word Embeddings.
3. Subword Tokenization.

Stemming:
1. Hybrid Stemming with Sentiment Analysis.
2. Hybrid Stemming with Named Entity Recognition (NER).
3. Hybrid Stemming with Contextual Information.

Lemmatization:
1. Machine Learning-based Lemmatization.
2. Hybrid Lemmatization with Word Embeddings.
3. Hybrid Lemmatization with Named Entity Recognition (NER).

Datasets        No. of features
Ott             1230
Yelp            1238
Amazon          1234
Trip Advisor    1247
IMDb            1225
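
The table names Byte Pair Encoding (BPE) among the tokenization techniques but does not describe it. As an illustration only, the following is a minimal toy sketch of how BPE merge rules can be learned from a word list and then applied to a new word. It is not the implementation used in the paper: the function names (learn_bpe, merge_pair, bpe_tokenize), the "</w>" end-of-word marker, and the example words are all assumptions made for this sketch.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (hypothetical helper)."""
    # Each word becomes a tuple of characters plus an assumed end-of-word marker,
    # weighted by how often the word occurs.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole vocabulary.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair (ties go to the first pair seen).
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite the vocabulary with the chosen pair merged.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def bpe_tokenize(word, merges):
    """Split `word` into subword units by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    rules = learn_bpe(["low"] * 5, num_merges=10)
    print(rules)
    print(bpe_tokenize("low", rules))
    print(bpe_tokenize("lot", rules))
```

Frequent character sequences collapse into single units, while an unseen word such as "lot" falls back to smaller pieces. This is the general idea behind the subword tokenization entries in the table.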