Table 3.
Adjusted Pearson residuals of large language model decisions.
| Models and affiliation level | Q1ᵃ | Q2ᵇ | Q3ᶜ | Q4ᵈ | Q1 and Q2 | Q3 and Q4 |
|---|---|---|---|---|---|---|
| **Llama 3.3-70B** | | | | | | |
| None | −1.063 | 0.8343 | 0.4576 | −1.0017 | −0.2288 | −0.5440 |
| High tier | 11.2757 | −0.8985 | −0.5720 | 0.5008 | 0.3773 | −0.0712 |
| Low tier | −0.2126 | 0.0642 | 0.1144 | 0.5008 | −0.1484 | 0.6152 |
| **Mistral-7B** | | | | | | |
| None | −0.0649 | −1.0999 | 1.4668 | 0 | −1.1648 | 1.4668 |
| High tier | 0.1298 | 0.8105 | −1.1734 | 0 | 0.9403 | −1.1734 |
| Low tier | −0.0649 | 0.2895 | −0.2934 | 0 | 0.2246 | −0.2934 |
| **Gemma 2-9B** | | | | | | |
| None | 0 | −0.2619 | −1.4974 | *2.8717*ᵉ | −0.2619 | 1.3743 |
| High tier | 0 | 0.5237 | 0.5240 | −1.6231 | 0.5237 | −1.0991 |
| Low tier | 0 | −0.2619 | 0.9732 | −1.2486 | −0.2619 | −0.2753 |
| **DeepSeek r1-distill Qwen-14B** | | | | | | |
| None | 0.5377 | −0.9462 | 0.9841 | −0.3647 | −0.4086 | 0.6195 |
| High tier | −1.0753 | 0.8280 | −0.7526 | 0.3647 | −0.2474 | −0.3879 |
| Low tier | 0.5377 | 0.1183 | −0.2316 | 0 | 0.6559 | −0.2316 |
| **Qwen 2.5-7B** | | | | | | |
| None | 2.1374 | −1.7008 | 1.2394 | −1.0017 | 0.4366 | 0.2377 |
| High tier | −0.7125 | 1.7008 | −1.7352 | 2.0033 | 0.9884 | 0.2682 |
| Low tier | −1.4249 | 0 | 0.4958 | −1.0017 | −1.4249 | −0.5059 |
ᵃQ1: quartile 1.
ᵇQ2: quartile 2.
ᶜQ3: quartile 3.
ᵈQ4: quartile 4.
ᵉItalicization indicates statistically significant residuals.
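
For readers who want to see how values of this kind arise, the following is a minimal sketch of the standard adjusted Pearson residual for one cell of a contingency table: the observed count minus the expected count, divided by the estimated standard error of that difference. The function name `adjusted_pearson_residuals` and the counts in `example_counts` are hypothetical and are not taken from the study; returning 0 for cells whose row or column carries no information is likewise an assumption, made only to mirror the 0 entries reported above.

```python
import numpy as np


def adjusted_pearson_residuals(observed):
    """Return adjusted Pearson residuals for a 2D table of counts.

    residual_ij = (O_ij - E_ij) / sqrt(E_ij * (1 - r_i / n) * (1 - c_j / n))
    where E_ij = r_i * c_j / n, r_i is a row total, c_j a column total,
    and n the grand total.
    """
    observed = np.asarray(observed, dtype=float)
    n = observed.sum()
    row_totals = observed.sum(axis=1, keepdims=True)  # shape (rows, 1)
    col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, cols)

    expected = row_totals @ col_totals / n
    variance = expected * (1 - row_totals / n) * (1 - col_totals / n)

    residuals = np.zeros_like(observed)
    informative = variance > 0  # assumption: degenerate cells are reported as 0
    residuals[informative] = (
        (observed[informative] - expected[informative])
        / np.sqrt(variance[informative])
    )
    return residuals


# Hypothetical counts: rows are affiliation levels, columns are two outcome bins.
example_counts = [[10, 20], [20, 10]]
example_residuals = adjusted_pearson_residuals(example_counts)
```

Under the null hypothesis of independence, each residual is approximately standard normal, so a common convention (assumed here, not stated in the table) is to flag cells whose absolute value exceeds about 1.96, possibly with a stricter cutoff when many cells are tested at once.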