Table 2.
Comparison of mean scores according to question type
| Characteristic | GPT-3.5: mean score | GPT-4.0: mean score | Low seniority: mean score | Low seniority: p value (with GPT-3.5) | Low seniority: p value (with GPT-4.0) | Middle seniority: mean score | Middle seniority: p value (with GPT-3.5) | Middle seniority: p value (with GPT-4.0) | High seniority: mean score | High seniority: p value (with GPT-3.5) | High seniority: p value (with GPT-4.0) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| treatment strategy | 1.750 ± 1.260 | 2.500 ± 0.885 | 2.000 ± 1.022 | 0.377 | 0.056 | 2.080 ± 0.881 | 0.224 | 0.076 | 2.790 ± 0.415 | 0.001 | 0.110 |
| intraoperative strategy | 1.670 ± 1.291 | 2.530 ± 0.743 | 1.730 ± 1.100 | 0.806 | 0.013 | 1.930 ± 1.033 | 0.452 | 0.014 | 2.600 ± 0.507 | 0.005 | 0.719 |
| medicine option | 2.000 ± 1.095 | 2.820 ± 0.405 | 2.270 ± 0.786 | 0.432 | 0.052 | 2.820 ± 0.405 | 0.042 | 1.000 | 3.000 ± 0.000 | 0.013 | 0.167 |