Table 6.
Comparison of test performance (and standard deviation) on the BIOSSES and BioASQ tasks with major layer-specific adaptation methods, all with models
| | Baseline | Layer freeze | Layerwise decay | Layer reinit |
|---|---|---|---|---|
| BIOSSES | | | | |
| BioBERT- | 84.92 (10.2) | 88.65 (6.60) | 90.13 (1.70) | 91.53∗ (4.09) |
| BlueBERT- | 82.35 (2.21) | 84.56 (1.95) | 84.80 (2.58) | 86.18∗ (1.21) |
| PubMedBERT- | 91.06 (1.51) | 91.19 (1.12) | 90.87 (0.92) | 92.73∗ (0.96) |
| PubMedELECTRA- | 71.61 (4.91) | 86.42 (3.33) | 86.17 (1.21) | 90.33∗ (1.04) |
| BioASQ | | | | |
| BioBERT- | 67.79 (6.59) | 74.57 (3.18) | 78.93∗ (3.39) | 74.43 (9.76) |
| BlueBERT- | 70.43 (3.91) | 70.21 (2.99) | 72.21∗ (2.68) | 70.86 (2.15) |
| PubMedBERT- | 92.36 (1.36) | 93.14 (2.35) | 93.36∗ (1.22) | 91.21 (1.54) |
| PubMedELECTRA- | 79.93 (5.04) | 88.64 (4.09) | 93.14∗ (1.71) | 88.07 (3.21) |
∗ Highest performance for model (row).