2023 Nov 22;13:20512. doi: 10.1038/s41598-023-46995-z

Table 4.

Results of the correlation analysis for a temperature parameter of 1: Pearson correlation coefficients (p-values in brackets), p-values from the Mann–Whitney U test comparing the index of difficulty for correct and incorrect answers, and Cohen’s d.

Language	Model	Statistic	S22	A22	S23
Polish	GPT-3.5	Pearson correlation coefficient (p-value)	0.319 (< 0.001***)	0.301 (< 0.001***)	0.174 (0.015**)
Polish	GPT-3.5	p-value from Mann–Whitney U test	< 0.001***	< 0.001***	0.004**
Polish	GPT-3.5	Cohen’s d	0.671	0.634	0.362
Polish	GPT-4	Pearson correlation coefficient (p-value)	0.338 (< 0.001***)	0.334 (< 0.001***)	0.299 (< 0.001***)
Polish	GPT-4	p-value from Mann–Whitney U test	< 0.001***	< 0.001***	< 0.001***
Polish	GPT-4	Cohen’s d	0.834	0.861	0.829
English	GPT-3.5	Pearson correlation coefficient (p-value)	0.229 (0.001**)	0.245 (< 0.001***)	0.219 (0.002**)
English	GPT-3.5	p-value from Mann–Whitney U test	< 0.001***	< 0.001***	0.003**
English	GPT-3.5	Cohen’s d	0.474	0.511	0.453
English	GPT-4	Pearson correlation coefficient (p-value)	0.363 (< 0.001***)	0.273 (< 0.001***)	0.307 (< 0.001***)
English	GPT-4	p-value from Mann–Whitney U test	< 0.001***	0.001**	< 0.001***
English	GPT-4	Cohen’s d	0.933	0.714	0.802
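
The three statistics reported in the table can be illustrated with a minimal sketch. It rests on several assumptions that the caption does not confirm: that the Pearson coefficient is computed between a binary correctness flag and the index of difficulty, that Cohen’s d uses a pooled standard deviation, and that the Mann–Whitney p-value is two-sided. The function names and the toy data are hypothetical, and the p-value here uses a plain normal approximation without tie or continuity correction, so it will not reproduce the published values exactly.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a and a two-sided p-value.

    The p-value uses the normal approximation with no tie or
    continuity correction, which is a simplification.
    """
    na, nb = len(group_a), len(group_b)
    # Count pairs where a beats b; ties count as half a win.
    u = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    mu = na * nb / 2
    sigma = sqrt(na * nb * (na + nb + 1) / 12)
    z = (u - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p


# Toy data (made up for illustration, not taken from the study):
# one index-of-difficulty value and one correctness flag per question.
difficulty = [0.90, 0.80, 0.85, 0.40, 0.70, 0.30, 0.60, 0.95]
correct = [1, 1, 1, 0, 1, 0, 0, 1]

diff_correct = [d for d, c in zip(difficulty, correct) if c == 1]
diff_incorrect = [d for d, c in zip(difficulty, correct) if c == 0]

r = pearson_r(correct, difficulty)
d = cohens_d(diff_correct, diff_incorrect)
u, p = mann_whitney_u(diff_correct, diff_incorrect)

print(f"Pearson r = {r:.3f}, Cohen's d = {d:.3f}, U = {u}, p = {p:.4f}")
```

In this sketch a positive r and a positive d both indicate that correctly answered questions tend to have a higher index of difficulty value, which matches the sign of every entry in the table.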