Table 5. Validation results for deep learning models on the 21K review dataset.
| Method | Accuracy | Std. dev. | Class | Precision | Recall | F1-score |
|---|---|---|---|---|---|---|
| RNN | 0.946 | 0.0017 | Positive | 0.970 | 0.981 | 0.975 |
| | | | Neutral/Mixed | 0.688 | 0.621 | 0.652 |
| | | | Negative | 0.930 | 0.914 | 0.922 |
| GRU | 0.950 | 0.0011 | Positive | 0.971 | 0.983 | 0.977 |
| | | | Neutral/Mixed | 0.710 | 0.627 | 0.665 |
| | | | Negative | 0.935 | 0.924 | 0.930 |
| LSTM | 0.945 | 0.0007 | Positive | 0.971 | 0.976 | 0.974 |
| | | | Neutral/Mixed | 0.676 | 0.621 | 0.646 |
| | | | Negative | 0.925 | 0.936 | 0.931 |
| BERT | 0.964 | 0.0027 | Positive | 0.975 | 0.993 | 0.984 |
| | | | Neutral/Mixed | 0.834 | 0.679 | 0.748 |
| | | | Negative | 0.955 | 0.946 | 0.951 |
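The sketch below is not taken from the paper; it only illustrates, under the assumption that scikit-learn is used with placeholder label arrays, how the quantities reported in Table 5 (mean accuracy with its standard deviation across runs, and per-class precision, recall, and F1-score) are typically computed. The `runs` variable and its contents are hypothetical.

```python
# Minimal sketch (assumed workflow, not the authors' code) of computing the
# metrics shown in Table 5 with scikit-learn on placeholder predictions.
import numpy as np
from sklearn.metrics import accuracy_score, classification_report

LABELS = ["Positive", "Neutral/Mixed", "Negative"]

# Hypothetical per-run predictions: each entry pairs true labels with the
# labels predicted by one trained model (e.g., one validation run or fold).
runs = [
    (["Positive", "Negative", "Neutral/Mixed", "Positive"],
     ["Positive", "Negative", "Negative", "Positive"]),
    (["Positive", "Negative", "Neutral/Mixed", "Negative"],
     ["Positive", "Negative", "Neutral/Mixed", "Negative"]),
]

# Mean accuracy and its standard deviation across runs
# (the "Accuracy" and "Std. dev." columns).
accuracies = [accuracy_score(y_true, y_pred) for y_true, y_pred in runs]
print(f"accuracy: {np.mean(accuracies):.3f} +/- {np.std(accuracies):.4f}")

# Per-class precision, recall, and F1-score for a single run
# (the remaining columns of the table).
y_true, y_pred = runs[0]
print(classification_report(y_true, y_pred, labels=LABELS, zero_division=0))
```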