Table 1.
Precision, recall, and F1-scores of deep neural network classifiers for the class of tweets that self-report a COVID-19 diagnosis, evaluated on a held-out test set of 2000 manually annotated tweets.
| Classifier | Precision | Recall | F1-score |
| --- | --- | --- | --- |
| BERT-Base-Uncased | 0.82 | 0.85 | 0.84 |
| DistilBERT-Base-Uncased | 0.83 | 0.77 | 0.80 |
| RoBERTa-Large | 0.87 | 0.92 | 0.90 |
| BERTweet-Large | 0.90 | 0.91 | 0.91 |
| COVID-Twitter-BERT | 0.96 | 0.91 | 0.94 |
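
For readers who want to check how the F1-score column relates to the other two, the following is a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall. The function name `f1_score` and the dictionary `table_1` are illustrative and not taken from the source. Because the inputs are the rounded values printed in Table 1, the recomputed scores can differ from the reported ones by about 0.01, presumably due to rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded (precision, recall) pairs copied from Table 1.
table_1 = {
    "BERT-Base-Uncased": (0.82, 0.85),
    "DistilBERT-Base-Uncased": (0.83, 0.77),
    "RoBERTa-Large": (0.87, 0.92),
    "BERTweet-Large": (0.90, 0.91),
    "COVID-Twitter-BERT": (0.96, 0.91),
}

for name, (p, r) in table_1.items():
    # Three decimals make the effect of input rounding visible.
    print(f"{name}: F1 = {f1_score(p, r):.3f}")
```

The harmonic mean penalizes an imbalance between precision and recall, which is why DistilBERT-Base-Uncased, with the lowest recall, also has the lowest F1-score.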