Algorithm 1.
Steps of the proposed algorithm.
| 1: Input: a dataset of pairs (xi, yi), where xi is the input sentence and yi is the corresponding label. |
| 2: Load the pre-trained tokenizer, split each sentence into words or subwords, and pad all token lists to the same length. |
| 3: Run the DistilBERT model on the dataset to obtain an embedding vector for each sentence. |
| 4: Feed the embedding vectors into the logistic regression model to classify the sentences. |
| 5: Model evaluation. |
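
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The checkpoint name `distilbert-base-uncased`, the use of the first-token ([CLS]) hidden state as the sentence embedding, the 75/25 train/test split, and accuracy as the evaluation metric are all assumptions that the algorithm does not specify. DistilBERT is used here as a frozen feature extractor with no fine-tuning, which is also an assumption. The helper names `pad_token_ids`, `embed_sentences`, and `train_and_evaluate` are hypothetical.

```python
from typing import List, Sequence, Tuple


def pad_token_ids(
    sequences: Sequence[Sequence[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Step 2 (padding): right-pad every token-id list to the longest length.

    Returns the padded lists and a matching attention mask (1 = real token).
    """
    max_len = max((len(seq) for seq in sequences), default=0)
    padded = [list(seq) + [pad_id] * (max_len - len(seq)) for seq in sequences]
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return padded, mask


def embed_sentences(sentences: List[str], model_name: str = "distilbert-base-uncased"):
    """Steps 2-3: tokenize, pad, and extract one embedding per sentence."""
    # Heavy dependencies are imported lazily so the padding helper works alone.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # inference only; the weights are not updated

    # Split each sentence into subword ids, including [CLS] and [SEP].
    token_ids = [
        tokenizer.encode(s, add_special_tokens=True, truncation=True)
        for s in sentences
    ]
    padded, mask = pad_token_ids(token_ids, pad_id=tokenizer.pad_token_id)

    with torch.no_grad():
        output = model(
            input_ids=torch.tensor(padded),
            attention_mask=torch.tensor(mask),
        )
    # Assumption: the hidden state at position 0 ([CLS]) is the sentence embedding.
    return output.last_hidden_state[:, 0, :].numpy()


def train_and_evaluate(
    sentences: List[str], labels: List[int], test_size: float = 0.25
) -> float:
    """Steps 4-5: fit logistic regression on the embeddings, return test accuracy."""
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    features = embed_sentences(sentences)
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=test_size, random_state=0
    )
    clf = LogisticRegression(max_iter=1000)
    clf.fit(x_train, y_train)
    return clf.score(x_test, y_test)  # assumption: accuracy as the metric
```

Calling `train_and_evaluate(sentences, labels)` on a labeled corpus runs all five steps. The padding helper is kept separate so that step 2 can be checked without downloading the model.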