2025 Mar 27;27:e65537. doi: 10.2196/65537

Table 4.

Performance comparison of GPT-4, Qwen2-72B, Llama3-70B, and GPT-3.5 models on the sepsis dataset.

Model	Precision	Recall	F₁-score
Qwen2-72B	44.73	42.85	43.77
Llama3-70B	49.40	47.43	48.39
GPT-3.5	56.63	54.48	55.53
GPT-4 (zero-shot)	72.12	70.48	71.29
GPT-4 (few-shot)	77.73	75.81	76.76
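
The table does not state how the F₁-score column was derived. As a sanity check, the sketch below assumes the standard definition, F₁ = 2PR / (P + R), the harmonic mean of precision and recall, and recomputes it from each row's precision and recall. The names `TABLE_4`, `f1_score`, and `check_table` are illustrative, not from the article, and the 0.01 tolerance is an assumption meant to absorb two-decimal rounding.

```python
# Rows copied from Table 4: (precision, recall, reported F1), as printed.
TABLE_4 = {
    "Qwen2-72B": (44.73, 42.85, 43.77),
    "Llama3-70B": (49.40, 47.43, 48.39),
    "GPT-3.5": (56.63, 54.48, 55.53),
    "GPT-4 (zero-shot)": (72.12, 70.48, 71.29),
    "GPT-4 (few-shot)": (77.73, 75.81, 76.76),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (assumed standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def check_table(rows, tolerance=0.01):
    """Map each model to (recomputed F1, reported F1, agrees within tolerance)."""
    results = {}
    for model, (precision, recall, reported) in rows.items():
        recomputed = f1_score(precision, recall)
        agrees = abs(recomputed - reported) <= tolerance
        results[model] = (recomputed, reported, agrees)
    return results


if __name__ == "__main__":
    for model, (recomputed, reported, agrees) in check_table(TABLE_4).items():
        status = "consistent" if agrees else "MISMATCH"
        print(f"{model}: recomputed {recomputed:.2f}, reported {reported:.2f} ({status})")
```

Under that assumption, every reported F₁-score agrees with the value recomputed from its row to within rounding, which suggests each F₁ was computed from the listed precision and recall rather than averaged separately.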