Figure - PMC

Author manuscript; available in PMC: 2021 Mar 30.

Published in final edited form as: IEEE Trans Med Imaging. 2021 Mar 19;PP:10.1109/TMI.2021.3067512. doi: 10.1109/TMI.2021.3067512

Statistical analysis of the pairwise comparison experiments using the Thurstone model; an open-source implementation is available at https://github.com/mantiuk/pwcmp.

(a) The box plots show the estimated distribution of perceived quality for each reconstruction method, expressed as the probability of selecting that method over all others. This estimate is used only to screen observer responses for potential outliers; it is not used as the statistical measure for comparing reconstructions. The blue circles mark the answers of observers whose behaviour is inconsistent with that of the other observers, as measured by an inter-quartile-normalised score. Observers with an inter-quartile-normalised score higher than 0.2 are considered potential outliers and are excluded from further analysis.

(b) Visualisation of Just-Objectionable-Differences (JOD), given as the scaled results of the Thurstone model, with their confidence intervals. BASELINE has no confidence interval because it is the reference method relative to which the scores of the other methods are computed. The confidence intervals represent the range in which the estimated quality values lie with 95% confidence; they are not a measure of the statistical significance of the difference between methods. A difference of 1 JOD indicates that 75% of observers selected one condition as better than the other. The confidence intervals are estimated by bootstrapping with 5000 random samples.
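The relation between a JOD difference and the share of observers preferring one condition can be illustrated with a minimal sketch. It assumes the usual Thurstone Case V setting, in which the preference probability is a normal cumulative distribution function of the quality difference, with the scale fixed so that 1 JOD maps to 75%. The names `JOD_SIGMA`, `jod_to_probability`, and `probability_to_jod` are hypothetical and are not taken from the pwcmp code.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
_STD_NORMAL = NormalDist()

# Assumption: the scale is fixed so that a 1 JOD difference corresponds to a
# preference probability of 0.75. This gives a value of roughly 1.4826.
JOD_SIGMA = 1.0 / _STD_NORMAL.inv_cdf(0.75)


def jod_to_probability(jod_difference: float) -> float:
    """Probability that the higher-scored condition is selected as better."""
    return _STD_NORMAL.cdf(jod_difference / JOD_SIGMA)


def probability_to_jod(probability: float) -> float:
    """JOD difference implied by a preference probability strictly between 0 and 1."""
    return JOD_SIGMA * _STD_NORMAL.inv_cdf(probability)


if __name__ == "__main__":
    for d in (0.0, 0.5, 1.0, 2.0):
        print(f"{d:.1f} JOD: P(selected as better) = {jod_to_probability(d):.3f}")
```

Under this assumption, a difference of 0 JOD corresponds to a 50% preference (no perceived difference), and the mapping is symmetric, so a difference of -1 JOD corresponds to 25%.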

(c) Representation of the statistical significance analysis for the comparison of the reconstruction methods. Red points indicate reconstruction methods, and each is connected to the methods it is compared with. Statistically significant differences are shown as solid blue lines; differences that are not statistically significant are shown as red dashed lines. The x-axis represents the scaling scores from plot (2b).