2024 Nov 14;4:e12. doi: 10.1017/S2633903X2400014X

Table 2.

Clustering, linear-evaluation, and LPIPS⁽⁵⁸⁾ metrics for VGG11 models trained on MNIST⁽²⁹⁾ with MoCov2⁽¹¹⁾ under various transformation sets. The effects of specific transformations are examined across training configurations. The First Set, in bold, yields representations of digit identity, while the Second Set focuses on handwriting style and thickness. Top1 Accuracy is obtained from a linear evaluation, and LPIPS, computed with an AlexNet⁽²⁷⁾ backbone, reflects perceptual similarity. Silhouette scores⁽⁴¹⁾ suggest good cluster quality in the Second Set, even though its low AMI score indicates that the clusters do not correspond to digit classes.

Transformation sets	Silhouette	AMI	Top1 Acc (%)	LPIPS
Rotation+Crop	0.74	0.79	98.4	0.22
Rotation+Crop+Padding	0.78	0.81	99.3	0.25
Rotation+Crop+Padding+ColorInversion (First Set)	0.87	0.83	99.6	0.33
Rotation+Crop+Flips	0.71	0.66	96.2	0.32
Rotation+Crop+Flips+RandomErasing (Second Set)	0.66	0.37	62.1	0.51
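
The contrast noted in the caption, a reasonable Silhouette score alongside a low AMI, can be illustrated with a minimal sketch. It is not the authors' evaluation pipeline: the clustering algorithm (k-means), the synthetic two-dimensional "embeddings", and the helper name `clustering_metrics` are all assumptions made for illustration. The Silhouette score depends only on the geometry of the clusters, whereas AMI compares the cluster assignments with the ground-truth labels, so the two can diverge when the embedding is organized by a factor other than the class label.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_mutual_info_score, silhouette_score


def clustering_metrics(embeddings, true_labels, n_clusters, seed=0):
    """Cluster embeddings with k-means (an assumed choice); return (silhouette, AMI)."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    assignments = kmeans.fit_predict(embeddings)
    # Silhouette uses only the embeddings and the cluster assignments.
    sil = silhouette_score(embeddings, assignments)
    # AMI compares the assignments with the ground-truth labels.
    ami = adjusted_mutual_info_score(true_labels, assignments)
    return sil, ami


rng = np.random.default_rng(0)

# Synthetic stand-in for learned embeddings: two tight, well-separated blobs.
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
blob_id = np.repeat([0, 1], 100)
embeddings = centers[blob_id] + rng.normal(scale=0.5, size=(200, 2))

# Case A: the class labels coincide with the blobs.
labels_aligned = blob_id
# Case B: the class labels are unrelated to the blobs (each blob holds both classes equally).
labels_unrelated = np.arange(200) % 2

sil_a, ami_a = clustering_metrics(embeddings, labels_aligned, n_clusters=2)
sil_b, ami_b = clustering_metrics(embeddings, labels_unrelated, n_clusters=2)

print(f"aligned labels:   silhouette={sil_a:.2f}, AMI={ami_a:.2f}")
print(f"unrelated labels: silhouette={sil_b:.2f}, AMI={ami_b:.2f}")
```

In this toy setting the Silhouette score is identical in both cases because the geometry is unchanged, while AMI falls to roughly zero when the labels are unrelated to the cluster structure. This mirrors the pattern in the Second Set row, where clusters remain fairly well separated but do not track digit identity.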