Table 2. Pairwise comparison results for each quantitative evaluation.
Each cell reports the Bonferroni-corrected p-value and the corresponding effect size, with magnitude interpretation.
Comparison | Inappr. reduction: p-value | Inappr. reduction: effect size | Image similarity: p-value | Image similarity: effect size | Text-image alignment: p-value | Text-image alignment: effect size |
---|---|---|---|---|---|---|
SD vs. SLD-Weak | | 0.372 (large) | – | – | 0.811 | 0.033 (negligible) |
SD vs. SLD-Max | | 0.803 (large) | – | – | | 0.097 (negligible) |
SD vs. ESD | | 0.189 (medium) | – | – | | 0.069 (negligible) |
SD vs. Ours | 0.002 | 0.112 (small) | – | – | 0.004 | 0.073 (negligible) |
SLD-Weak vs. SLD-Max | | 0.711 (large) | | 0.525 (large) | | 0.086 (negligible) |
SLD-Weak vs. ESD | | 0.203 (medium) | | 0.954 (large) | 0.062 | 0.033 (negligible) |
SLD-Weak vs. Ours | | 0.283 (large) | | 0.994 (large) | | 0.073 (negligible) |
SLD-Max vs. ESD | | 0.753 (large) | | 0.973 (large) | 0.079 | 0.044 (negligible) |
SLD-Max vs. Ours | | 0.781 (large) | | 0.997 (large) | | 0.121 (small) |
ESD vs. Ours | 0.029 | 0.084 (small) | | 0.559 (large) | | 0.130 (small) |
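
For readers unfamiliar with the correction named in the caption, the following is a minimal sketch of a Bonferroni adjustment, in which each raw p-value is multiplied by the number of comparisons in the family and capped at 1. The function name `bonferroni` and the raw p-values are hypothetical and are not taken from the paper. The underlying statistical test, the effect-size measure, and the size of the comparison family used for Table 2 are not stated here, so none of them is assumed.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values: min(1, m * p) for a family of m tests."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


# Hypothetical raw p-values for illustration only (not values from Table 2).
raw = [0.0002, 0.0004, 0.03, 0.2]
adjusted = bonferroni(raw)

for p, p_adj in zip(raw, adjusted):
    print(f"raw p = {p:.4f} -> adjusted p = {p_adj:.4f}")
```

An adjusted p-value is then compared against the usual significance level (for example 0.05), which is equivalent to comparing the raw p-value against that level divided by the number of comparisons.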