2023 Nov 22;13:20512. doi: 10.1038/s41598-023-46995-z

Table 5.

Results of the correlation analysis for the discrimination power index with the temperature parameter equal to 0: Pearson correlation coefficient (p-value in brackets), p-value from the Mann–Whitney U test comparing the index values for correct and incorrect answers, and Cohen’s d.

Language, model, statistic	S22	A22	S23
Polish, GPT-3.5
Pearson correlation coefficient (p-value)	−0.124 (0.083 ns)	−0.243 (<0.001***)	−0.327 (<0.001***)
p-value from Mann–Whitney U test	0.053 ns	<0.001***	<0.001***
Cohen’s d	0.251	0.499	0.690
Polish, GPT-4
Pearson correlation coefficient (p-value)	−0.029 (0.690 ns)	−0.249 (<0.001***)	−0.185 (0.010**)
p-value from Mann–Whitney U test	0.410 ns	0.001**	0.031*
Cohen’s d	0.067	0.661	0.482
English, GPT-3.5
Pearson correlation coefficient (p-value)	−0.103 (0.150 ns)	−0.176 (0.013*)	−0.182 (0.011*)
p-value from Mann–Whitney U test	0.090 ns	0.005**	0.009**
Cohen’s d	0.210	0.363	0.394
English, GPT-4
Pearson correlation coefficient (p-value)	−0.011 (0.877 ns)	−0.146 (0.041*)	−0.140 (0.051 ns)
p-value from Mann–Whitney U test	0.704 ns	0.072 ns	0.137 ns
Cohen’s d	0.026	0.368	0.380
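
The three statistics reported in the table can be illustrated with a minimal Python sketch that uses only the standard library. It is not the authors' code, and several choices are assumptions: correctness is coded as 1/0 for the Pearson coefficient, the Mann–Whitney p-value uses a two-sided normal approximation with tie correction and no continuity correction, and Cohen's d uses the pooled standard deviation with the sign taken as incorrect minus correct. The function names (`pearson_r`, `mann_whitney_u`, `cohens_d`, `summarize`) and the example data are hypothetical. The p-value of the Pearson coefficient is omitted because it requires a t distribution, which the standard library does not provide.

```python
import math
from statistics import NormalDist, mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mann_whitney_u(a, b):
    """U statistic of sample a and a two-sided p-value (normal approximation)."""
    combined = sorted(list(a) + list(b))
    n = len(combined)
    ranks = {}       # value -> average rank among tied values
    tie_term = 0.0   # sum of t^3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and combined[j] == combined[i]:
            j += 1
        t = j - i
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    na, nb = len(a), len(b)
    u = sum(ranks[v] for v in a) - na * (na + 1) / 2
    mu = na * nb / 2
    sigma = math.sqrt(na * nb / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p


def cohens_d(a, b):
    """Cohen's d as mean(a) - mean(b) divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def summarize(index_values, is_correct):
    """Compute the three statistics for one model, language, and exam."""
    correct = [v for v, c in zip(index_values, is_correct) if c]
    incorrect = [v for v, c in zip(index_values, is_correct) if not c]
    coded = [1.0 if c else 0.0 for c in is_correct]  # assumed 1/0 coding
    u, p = mann_whitney_u(correct, incorrect)
    return {
        "pearson_r": pearson_r(index_values, coded),
        "mann_whitney_u": u,
        "mann_whitney_p": p,
        "cohens_d": cohens_d(incorrect, correct),  # assumed sign convention
    }


# Hypothetical data: discrimination power index per question and whether
# the model answered that question correctly.
example_index = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
example_correct = [True, True, True, False, False, False]
example_result = summarize(example_index, example_correct)
print(example_result)
```

With this coding, a negative Pearson coefficient and a positive Cohen's d both indicate that incorrectly answered questions tend to have a higher discrimination power index, which matches the direction of the signs in the table.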