letter
2024 Oct 15;26:e60695. doi: 10.2196/60695

Table 2.

Performance of retrieval-augmented large language models in matching physician clinical trial recommendations.

Performance                   Precision (%)   Recall (%)   F1-score
Baseline GPT-4                0.0             0.0          0
Retrieval-augmented GPT-4     63.0            100.0        0.77
Subgroups (cancer types)
  Head and neck cancers       72.7            100.0        0.84
  Thyroid cancers             33.3            100.0        0.50
  Skin cancers                50.0            100.0        0.67
  Salivary gland cancers      36.4            100.0        0.53
  Other cancers^a
Subgroups (biomarkers)
  Present                     72.7            100.0        0.84
  None                        62.1            100.0        0.77

^a Not applicable.
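
The F1-score column is consistent with the standard definition of F1 as the harmonic mean of precision and recall. The short sketch below recomputes it from the precision and recall values reported in the table; the function name `f1_score` and the `rows` dictionary are illustrative only, and treating the 0/0 baseline case as F1 = 0 is an assumed convention rather than something stated in the source.

```python
def f1_score(precision_pct: float, recall_pct: float) -> float:
    """Harmonic mean of precision and recall, both given as percentages."""
    p = precision_pct / 100.0
    r = recall_pct / 100.0
    if p + r == 0:
        # Assumed convention: F1 is taken as 0 when precision and recall are both 0.
        return 0.0
    return 2 * p * r / (p + r)


# (precision %, recall %) pairs copied from Table 2.
rows = {
    "Baseline GPT-4": (0.0, 0.0),
    "Retrieval-augmented GPT-4": (63.0, 100.0),
    "Head and neck cancers": (72.7, 100.0),
    "Thyroid cancers": (33.3, 100.0),
    "Skin cancers": (50.0, 100.0),
    "Salivary gland cancers": (36.4, 100.0),
    "Biomarkers present": (72.7, 100.0),
    "Biomarkers none": (62.1, 100.0),
}

for label, (precision, recall) in rows.items():
    print(f"{label}: F1 = {f1_score(precision, recall):.2f}")
```

Because recall is 100% in every non-baseline row, F1 there reduces to 2p / (1 + p), so differences in F1 across subgroups are driven entirely by precision.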