Boxplots of the stability score of models before (left) and after (right) fine-tuning for 100 epochs using the stability score as a training metric.
The models were further trained with either 2, 4, or 8 bits of latent redundancy and a block size of 8 or 16 bits. The red line marks the median, the green triangle marks the mean, and green dots mark outliers. The difference in means between the two groups is statistically significant (t-statistic: 2.91, p-value: 0.008).