. 2025 Feb 1;15:3980. doi: 10.1038/s41598-025-86623-6

Table 5.

Between-participant comparisons for advice evaluations (Study 4). The columns give the mean evaluations of ChatGPT and human-generated advice (“M”) and their corresponding standard deviations (“SD”) within each ORDER condition (“ChatGPT First” and “Self First”). The right-hand portion of the table presents the results of t-tests comparing the two ORDER conditions: the mean difference between conditions, with the associated t-statistic, p-value, Cohen’s d effect size, and 95% confidence interval (“Lower CI” and “Upper CI”).

ChatGPT-Ratings, by ORDER
Variable	ChatGPT First M	ChatGPT First SD	Self First M	Self First SD	Mean diff.	t	p-value	Cohen’s d	Lower CI	Upper CI
Effectiveness	5.55	1.12	5.68	1.24	0.13	−1.04	.300	−0.11	−0.37	0.12
Quality	5.46	1.29	5.74	1.26	0.28	−2.11	.035	−0.22	−0.55	−0.02
Authenticity	5.16	1.37	5.34	1.51	0.19	−1.23	.218	−0.13	−0.48	0.11

Self-Ratings, by ORDER
Variable	ChatGPT First M	ChatGPT First SD	Self First M	Self First SD	Mean diff.	t	p-value	Cohen’s d	Lower CI	Upper CI
Effectiveness	5.67	0.88	5.62	1.03	−0.05	0.51	.607	0.05	−0.15	0.25
Quality	5.47	1.19	5.48	1.20	0.01	−0.08	.934	−0.01	−0.26	0.24
Authenticity	6.18	0.83	6.45	0.65	0.26	−3.36	.001	−0.35	−0.42	−0.11
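
As a rough illustration of how the Cohen’s d column relates to the reported means and standard deviations, the sketch below recomputes d from the summary statistics alone. It assumes a pooled standard deviation with equal group sizes; the table does not report group sizes or state which formula was used, so this is an assumption, and the function name `cohens_d` is ours. Because the tabulated means and SDs are rounded, some rows may differ from the reported value in the second decimal place.

```python
from math import sqrt


def cohens_d(m1, sd1, m2, sd2):
    """Cohen's d from summary statistics.

    Assumes equal group sizes, so the pooled SD is the root mean
    square of the two SDs (an assumption; the table does not give n).
    """
    pooled_sd = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m1 - m2) / pooled_sd


# ChatGPT-Ratings, Quality row: ChatGPT First (M, SD) vs. Self First (M, SD)
d_quality = cohens_d(5.46, 1.29, 5.74, 1.26)
print(round(d_quality, 2))  # -0.22

# ChatGPT-Ratings, Effectiveness row
d_effectiveness = cohens_d(5.55, 1.12, 5.68, 1.24)
print(round(d_effectiveness, 2))  # -0.11
```

Under these assumptions, both rows reproduce the tabulated effect sizes, with the sign following the ChatGPT First minus Self First direction used by the t-statistics.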