Table 4. Performance of baseline models in named entity recognition.
| Model, learning strategy, and entity | Precision | Recall | F1-score | Standard error (F1) | Lower CI (F1) | Upper CI (F1) |
|---|---|---|---|---|---|---|
| Bert-base-cased | ||||||
| SFT | ||||||
| Cancer_type | 0.5366 | 0.6286 | 0.5789 | 0.0493 | 0.4821 | 0.6756 |
| Indicated_symptom | 0.1667 | 0.1429 | 0.1538 | 0.0360 | 0.08 | 0.2245 |
| Product | 0.6773 | 0.7161 | 0.6962 | 0.0459 | 0.6060 | 0.7863 |
| Micro average | 0.6514 | 0.6905 | 0.6704 | 0.047 | 0.5782 | 0.7625 |
| Macro average | 0.4602 | 0.4959 | 0.4763 | 0.0499 | 0.3784 | 0.5741 |
| Weighted average | 0.6495 | 0.6905 | 0.6692 | 0.0470 | 0.5769 | 0.7614 |
| Bio_ClinicalBERT | ||||||
| SFT | ||||||
| Cancer_type | 0.5349 | 0.697 | 0.6053 | 0.0489 | 0.5094 | 0.7011 |
| Indicated_symptom | 0.3000 | 0.2143 | 0.2500 | 0.0433 | 0.1651 | 0.3348 |
| Product | 0.695 | 0.6583 | 0.6762 | 0.0468 | 0.5844 | 0.7679 |
| Micro average | 0.6675 | 0.6462 | 0.6567 | 0.0474 | 0.5636 | 0.75 |
| Macro average | 0.5100 | 0.5232 | 0.5105 | 0.0499 | 0.4125 | 0.6084 |
| Weighted average | 0.6684 | 0.6462 | 0.6558 | 0.0475 | 0.5626 | 0.7489 |
| gpt4-1106-preview-chat | ||||||
| Zero-shot | ||||||
| Cancer_type | 0.2885 | 0.6818 | 0.4054 | 0.0490 | 0.3091 | 0.5016 |
| Indicated_symptom | 0.0759 | 0.4615 | 0.1304 | 0.0336 | 0.06 | 0.1964 |
| Product | 0.3529 | 0.3243 | 0.338 | 0.0473 | 0.2452 | 0.4307 |
| Micro average | 0.2776 | 0.3619 | 0.3142 | 0.0464 | 0.2232 | 0.4051 |
| Macro average | 0.2391 | 0.4892 | 0.2913 | 0.0454 | 0.2022 | 0.3803 |
| Weighted average | 0.3334 | 0.3619 | 0.3333 | 0.0471 | 0.2409 | 0.4256 |
| Few-shot | ||||||
| Cancer_type | 0.3148 | 0.7727 | 0.4474 | 0.0497 | 0.3499 | 0.5448 |
| Indicated_symptom | 0.0536 | 0.2308 | 0.087 | 0.0281 | 0.03 | 0.1422 |
| Product | 0.4743 | 0.5405 | 0.5053 | 0.0499 | 0.4073 | 0.6032 |
| Micro average | 0.3857 | 0.5447 | 0.4516 | 0.0498 | 0.3540 | 0.5491 |
| Macro average | 0.2809 | 0.5147 | 0.3465 | 0.0476 | 0.2532 | 0.4397 |
| Weighted average | 0.4394 | 0.5447 | 0.4791 | 0.0499 | 0.3811 | 0.5770 |
| Many-shot | ||||||
| Cancer_type | 0.4000 | 0.6364 | 0.4912 | 0.05 | 0.3932 | 0.589 |
| Indicated_symptom | 0 | 0 | 0 | 0 | 0 | 0 |
| Product | 0.5672 | 0.5135 | 0.539 | 0.0498 | 0.44 | 0.6367 |
| Micro average | 0.5079 | 0.4981 | 0.5029 | 0.05 | 0.4049 | 0.601 |
| Macro average | 0.3224 | 0.3833 | 0.3434 | 0.0474 | 0.2503 | 0.4364 |
| Weighted average | 0.5242 | 0.4981 | 0.5077 | 0.0499 | 0.4097 | 0.6056 |
SFT: supervised fine-tuning.
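
The macro average and confidence interval columns can be cross-checked from the per-entity rows. The following is a minimal sketch, not the authors' evaluation code. It assumes that the macro average is the unweighted mean over the three entity types, and that the interval is a normal approximation, F1 ± 1.96 × standard error. The table does not state the confidence level, so `z=1.96` is an assumption, and the helper names `macro_average` and `normal_ci` are hypothetical.

```python
from statistics import mean

# Per-entity (precision, recall, F1) for Bert-base-cased with SFT, copied from Table 4.
per_entity = {
    "Cancer_type": (0.5366, 0.6286, 0.5789),
    "Indicated_symptom": (0.1667, 0.1429, 0.1538),
    "Product": (0.6773, 0.7161, 0.6962),
}


def macro_average(rows):
    """Unweighted mean of each metric across entity types (hypothetical helper)."""
    columns = zip(*rows.values())
    return tuple(round(mean(column), 4) for column in columns)


def normal_ci(f1, se, z=1.96):
    """Return f1 -/+ z * se; z = 1.96 is an assumption, not stated in the table."""
    return round(f1 - z * se, 4), round(f1 + z * se, 4)


macro = macro_average(per_entity)
ci = normal_ci(0.5789, 0.0493)  # Cancer_type F1 and its reported standard error
print("macro (P, R, F1):", macro)
print("Cancer_type F1 interval:", ci)
```

Under these assumptions, the computed macro averages match the reported Macro average row for Bert-base-cased (0.4602, 0.4959, 0.4763), and the computed Cancer_type interval agrees with the reported bounds (0.4821, 0.6756) to within about 0.001. The residual difference is consistent with the table values having been rounded before publication.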