Table 2:
Evaluation results on orthogonally generated DMS data for BioBERT-base models trained with three different text processing and sampling methods (evidence-only, rule-based filtered, and raw ClinVar data), compared with the pre-trained BioBERT-base.
Model | Accuracy | Precision | Recall | F1 Score | Pair-wise AUC (P/LP vs B/LB) | Pair-wise AUC (P/LP vs VUS) | Pair-wise AUC (B/LB vs VUS) | Avg AUC-ROC |
---|---|---|---|---|---|---|---|---|
BioBERT-base + ClinVar (evidence-only) | 0.4753 | 0.4930 | 0.4753 | 0.4219 | 0.9272 | 0.8043 | 0.5470 | 0.7595 |
BioBERT-base + ClinVar (rule-based) | 0.4891 | 0.5098 | 0.4891 | 0.4399 | 0.9096 | 0.7938 | 0.5377 | 0.7470 |
BioBERT-base + ClinVar (raw-data) | 0.4840 | 0.5306 | 0.4840 | 0.4192 | 0.9037 | 0.7882 | 0.5826 | 0.7582 |
BioBERT-base [9] | 0.2713 | 0.0736 | 0.2713 | 0.1158 | 0.3953 | 0.5428 | 0.3953 | 0.4503 |