Table 6. Validation results for deep learning models on the 50K review dataset.
| Method | Accuracy | Std. dev. | Class | Precision | Recall | F1-score |
|---|---|---|---|---|---|---|
| RNN | 0.953 | 0.0010 | Positive | 0.975 | 0.983 | 0.979 |
| | | | Neutral/Mixed | 0.744 | 0.729 | 0.736 |
| | | | Negative | 0.934 | 0.888 | 0.911 |
| GRU | 0.954 | 0.0001 | Positive | 0.973 | 0.986 | 0.979 |
| | | | Neutral/Mixed | 0.764 | 0.720 | 0.741 |
| | | | Negative | 0.941 | 0.890 | 0.914 |
| LSTM | 0.953 | 0.0001 | Positive | 0.974 | 0.984 | 0.979 |
| | | | Neutral/Mixed | 0.750 | 0.721 | 0.736 |
| | | | Negative | 0.938 | 0.889 | 0.913 |
| BERT | 0.964 | 0.0057 | Positive | 0.990 | 0.980 | 0.985 |
| | | | Neutral/Mixed | 0.843 | 0.750 | 0.794 |
| | | | Negative | 0.944 | 0.930 | 0.937 |
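The quantities reported in Table 6 (overall accuracy with its standard deviation across repeated runs, plus per-class precision, recall, and F1-score) can be computed as in the minimal sketch below, assuming ground-truth labels `y_true` and a list of per-run predictions `y_pred_runs` are available; these names and the use of scikit-learn are illustrative assumptions, not details from the original evaluation pipeline.

```python
# Sketch of the metric computation behind Table 6 (assumed setup, not the
# authors' exact code): mean/std accuracy over repeated runs and per-class
# precision, recall, and F1-score for the three sentiment classes.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

LABELS = ["Positive", "Neutral/Mixed", "Negative"]

def summarize_runs(y_true, y_pred_runs):
    # Accuracy averaged over repeated runs, with its standard deviation.
    accs = [accuracy_score(y_true, y_pred) for y_pred in y_pred_runs]
    mean_acc, std_acc = float(np.mean(accs)), float(np.std(accs))

    # Per-class precision, recall, and F1-score (here taken from one run).
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_true, y_pred_runs[-1], labels=LABELS, zero_division=0
    )
    per_class = {
        label: {"precision": p, "recall": r, "f1": f}
        for label, p, r, f in zip(LABELS, prec, rec, f1)
    }
    return mean_acc, std_acc, per_class
```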