letter

2024 Oct 15;26:e60695. doi: 10.2196/60695

Table 2.

Performance of retrieval-augmented large language models in matching physician clinical trial recommendations.

Performance		Precision (%)	Recall (%)	F₁-score
Baseline GPT-4		0.0	0.0	0
Retrieval-augmented GPT-4		63.0	100.0	0.77
Subgroups (cancer types)
	Head and neck cancers	72.7	100.0	0.84
	Thyroid cancers	33.3	100.0	0.50
	Skin cancers	50.0	100.0	0.67
	Salivary gland cancers	36.4	100.0	0.53
	Other cancers	—^a	—	—
Subgroups (biomarkers)
	Present	72.7	100.0	0.84
	None	62.1	100.0	0.77

^aNot applicable.
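
The F₁-scores in Table 2 can be checked against the reported precision and recall. The sketch below assumes the standard definition of F₁ as the harmonic mean of precision and recall; the letter's own computation is not shown on this page. The names `f1_score` and `rows` are illustrative only. The "Other cancers" row is omitted because its values are not applicable.

```python
def f1_score(precision_pct: float, recall_pct: float) -> float:
    """Return F1 on a 0-1 scale from precision and recall given in percent.

    Assumes the standard harmonic-mean definition: F1 = 2PR / (P + R).
    """
    p = precision_pct / 100.0
    r = recall_pct / 100.0
    if p + r == 0:
        # By convention, F1 is taken as 0 when precision and recall are both 0.
        return 0.0
    return 2 * p * r / (p + r)


# (precision %, recall %) pairs copied from Table 2.
rows = {
    "Baseline GPT-4": (0.0, 0.0),
    "Retrieval-augmented GPT-4": (63.0, 100.0),
    "Head and neck cancers": (72.7, 100.0),
    "Thyroid cancers": (33.3, 100.0),
    "Skin cancers": (50.0, 100.0),
    "Salivary gland cancers": (36.4, 100.0),
    "Biomarkers present": (72.7, 100.0),
    "Biomarkers none": (62.1, 100.0),
}

for label, (precision, recall) in rows.items():
    print(f"{label}: F1 = {f1_score(precision, recall):.2f}")
```

Under that assumption, each computed value rounds to the F₁-score reported in the table. Because recall is 100.0% in every non-baseline row, differences in F₁ across subgroups come entirely from precision.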