2025 Sep 23;11:e71102. doi: 10.2196/71102

Table 4. Performance of baseline models in named entity recognition.

Model, learning strategy, and entity Precision Recall F1-score Standard error (F1) Lower CI (F1) Upper CI (F1)
Bert-base-cased
 SFTa
  Cancer_type 0.5366 0.6286 0.5789 0.0493 0.4821 0.6756
  Indicated_symptom 0.1667 0.1429 0.1538 0.0360 0.08 0.2245
  Product 0.6773 0.7161 0.6962 0.0459 0.6060 0.7863
  Micro average 0.6514 0.6905 0.6704 0.047 0.5782 0.7625
  Macro average 0.4602 0.4959 0.4763 0.0499 0.3784 0.5741
  Weighted average 0.6495 0.6905 0.6692 0.0470 0.5769 0.7614
Bio_ClinicalBERT
 SFT
  Cancer_type 0.5349 0.697 0.6053 0.0489 0.5094 0.7011
  Indicated_symptom 0.3000 0.2143 0.2500 0.0433 0.1651 0.3348
  Product 0.695 0.6583 0.6762 0.0468 0.5844 0.7679
  Micro average 0.6675 0.6462 0.6567 0.0474 0.5636 0.75
  Macro average 0.5100 0.5232 0.5105 0.0499 0.4125 0.6084
  Weighted average 0.6684 0.6462 0.6558 0.0475 0.5626 0.7489
 Zero-shot
  Cancer_type 0.2885 0.6818 0.4054 0.0490 0.3091 0.5016
  Indicated_symptom 0.0759 0.4615 0.1304 0.0336 0.06 0.1964
  Product 0.3529 0.3243 0.338 0.0473 0.2452 0.4307
  Micro average 0.2776 0.3619 0.3142 0.0464 0.2232 0.4051
  Macro average 0.2391 0.4892 0.2913 0.0454 0.2022 0.3803
  Weighted average 0.3334 0.3619 0.3333 0.0471 0.2409 0.4256
gpt4-1106-preview-chat
 Few-shot
  Cancer_type 0.3148 0.7727 0.4474 0.0497 0.3499 0.5448
  Indicated_symptom 0.0536 0.2308 0.087 0.0281 0.03 0.1422
  Product 0.4743 0.5405 0.5053 0.0499 0.4073 0.6032
  Micro average 0.3857 0.5447 0.4516 0.0498 0.3540 0.5491
  Macro average 0.2809 0.5147 0.3465 0.0476 0.2532 0.4397
  Weighted average 0.4394 0.5447 0.4791 0.0499 0.3811 0.5770
 Many-shot
  Cancer_type 0.4000 0.6364 0.4912 0.05 0.3932 0.589
  Indicated_symptom 0 0 0 0 0 0
  Product 0.5672 0.5135 0.539 0.0498 0.44 0.6367
  Micro average 0.5079 0.4981 0.5029 0.05 0.4049 0.601
  Macro average 0.3224 0.3833 0.3434 0.0474 0.2503 0.4364
  Weighted average 0.5242 0.4981 0.5077 0.0499 0.4097 0.6056
aSFT: supervised fine-tuning.
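The table reports per-entity scores plus micro, macro, and weighted averages, together with a standard error and confidence bounds for each F1-score. As a reading aid only, the sketch below (Python, not from the article) shows how those three averaging schemes differ and how the lower and upper CI columns can be reproduced from an F1-score and its standard error under a normal-approximation assumption with z = 1.96; the per-entity counts, function names, and the 95% level are illustrative assumptions rather than values or code taken from the study.

```python
from dataclasses import dataclass


@dataclass
class EntityCounts:
    tp: int  # correctly predicted entity spans
    fp: int  # predicted spans with no matching gold span
    fn: int  # gold spans the model missed


def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Span-level precision, recall, and F1 from raw counts."""
    p = tp / (tp + fp) if (tp + fp) else 0.0
    r = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1


def averages(per_entity: dict) -> dict:
    """Micro, macro, and weighted averages of (precision, recall, F1)."""
    # Micro average: pool the counts over all entity types, then score once.
    micro = prf(
        sum(c.tp for c in per_entity.values()),
        sum(c.fp for c in per_entity.values()),
        sum(c.fn for c in per_entity.values()),
    )
    # Macro average: unweighted mean of the per-entity scores.
    scores = [prf(c.tp, c.fp, c.fn) for c in per_entity.values()]
    macro = tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))
    # Weighted average: per-entity scores weighted by support (tp + fn).
    supports = [c.tp + c.fn for c in per_entity.values()]
    total = sum(supports)
    weighted = tuple(
        sum(s[i] * w for s, w in zip(scores, supports)) / total for i in range(3)
    )
    return {"micro": micro, "macro": macro, "weighted": weighted}


def normal_ci(f1: float, se: float, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation CI; the 95% level (z = 1.96) is an assumption."""
    return f1 - z * se, f1 + z * se


# Hypothetical per-entity counts (not taken from the study).
counts = {
    "Cancer_type": EntityCounts(tp=50, fp=25, fn=30),
    "Indicated_symptom": EntityCounts(tp=5, fp=15, fn=20),
    "Product": EntityCounts(tp=160, fp=70, fn=60),
}
print(averages(counts))
print(normal_ci(0.5789, 0.0493))  # ~ (0.4823, 0.6755)
```

For example, for Cancer_type under Bert-base-cased with SFT, 0.5789 ± 1.96 × 0.0493 gives roughly 0.4823 to 0.6755, which is consistent with the reported bounds of 0.4821 and 0.6756 up to rounding.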