Patterns. 2024 Mar 26;5(5):100964. doi: 10.1016/j.patter.2024.100964

Table 3.

Comparison of the ESS-MB with various contrastive learning models trained on House100K

| CL model | ESS-MB | CL backbone | Pretext task training loss ↓ | ImageNet training loss ↓ | ImageNet test loss ↓ | ImageNet test accuracy (%) ↑ |
|---|---|---|---|---|---|---|
| SimCLR | – | ResNet-50 | 0.15 ± 0.01 | 4.88 ± 0.01 | 4.84 ± 0.01 | 16.81 ± 0.05 |
| SimCLR | ✓ | ResNet-50 | 0.59 ± 0.01 | 4.70 ± 0.01 | 4.79 ± 0.01 | 17.71 ± 0.13∗ |
| DCL | – | ResNet-50 | 3.75 ± 0.06 | 4.67 ± 0.01 | 4.71 ± 0.01 | 17.62 ± 0.11 |
| DCL | ✓ | ResNet-50 | 3.86 ± 0.00 | 4.66 ± 0.01 | 4.69 ± 0.02 | 18.15 ± 0.10∗ |
| CLSA | – | ResNet-50 | 11.44 ± 0.00 | 4.16 ± 0.03 | 4.06 ± 0.03 | 24.77 ± 0.33 |
| CLSA | ✓ | ResNet-50 | 11.23 ± 0.00 | 3.89 ± 0.01 | 3.83 ± 0.01 | 27.77 ± 0.22∗ |
| NNCLR | – | ResNet-18 | 3.39 ± 0.24 | 1,555 ± 8.26 | 7.03 ± 0.15 | 3.55 ± 0.03 |
| MoCo v.2 | ✓ | ResNet-18 | 3.89 ± 0.10 | 5.75 ± 0.01 | 5.71 ± 0.01 | 7.96 ± 0.08∗ |
| MoCo v.3 | – | ViT | 1.87 ± 0.01 | 4.58 ± 0.02 | 4.47 ± 0.02 | 19.27 ± 0.21 |
| MoCo v.3 | ✓ | ViT | 2.11 ± 0.00 | 4.57 ± 0.01 | 4.46 ± 0.01 | 19.84 ± 0.13∗ |

CL stands for contrastive learning; the last three columns report downstream ImageNet classification results. A ✓ indicates that ESS-MB is implemented on the specified contrastive learning model. We compare NNCLR against ESS-MB implemented on MoCo v.2, because NNCLR's different definition of positive pairs (sketched below) complicates applying ESS-MB to NNCLR directly. For each model type, the better downstream classification result is marked with an asterisk (∗).
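
To make the positive-pair distinction concrete, here is a minimal, hypothetical sketch of the two definitions: SimCLR-style methods treat two augmentations of the same image as a positive pair, whereas NNCLR substitutes the nearest neighbor drawn from a support set of past embeddings. The names `backbone`, `augment`, and `support_set` are placeholders for illustration, not the paper's code.

```python
import torch
import torch.nn.functional as F

def simclr_positive_pair(x, augment, backbone):
    """SimCLR-style: the positive pair is two augmented views of the SAME image."""
    z1 = backbone(augment(x))
    z2 = backbone(augment(x))
    return z1, z2

def nnclr_positive_pair(x, augment, backbone, support_set):
    """NNCLR-style: one view is replaced by its nearest neighbor in a support
    set of previously computed embeddings, i.e., a DIFFERENT image's embedding."""
    z1 = backbone(augment(x))
    z2 = backbone(augment(x))
    # Find the nearest neighbor of each embedding in the support set
    # under cosine similarity, and use it as the positive.
    sim = F.normalize(z1, dim=1) @ F.normalize(support_set, dim=1).T
    nn_idx = sim.argmax(dim=1)
    return support_set[nn_idx], z2
```

Because the positive in NNCLR comes from the support set rather than from the anchor image itself, a sampling scheme defined over same-image pairs cannot be transferred to it unchanged, which is why the table pairs NNCLR with ESS-MB on MoCo v.2 instead.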
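
For readers unfamiliar with how downstream columns like these are typically produced, the following is a simplified sketch of a standard linear-probe evaluation: a linear classifier is trained on frozen backbone features and its training loss, test loss, and test accuracy are reported. This is a common protocol, not necessarily the paper's exact setup; the full-batch optimization below is for brevity (real runs use mini-batches), and all names are illustrative.

```python
import torch
from torch import nn

@torch.no_grad()
def extract_features(backbone, loader, device="cpu"):
    """Run the frozen encoder over a dataset and collect (features, labels)."""
    backbone.eval()
    feats, labels = [], []
    for x, y in loader:
        feats.append(backbone(x.to(device)).cpu())
        labels.append(y)
    return torch.cat(feats), torch.cat(labels)

def linear_probe(backbone, train_loader, test_loader, num_classes=1000, epochs=10):
    # Freeze the pretrained encoder; train only a linear classifier on top.
    f_train, y_train = extract_features(backbone, train_loader)
    f_test, y_test = extract_features(backbone, test_loader)
    clf = nn.Linear(f_train.shape[1], num_classes)
    opt = torch.optim.SGD(clf.parameters(), lr=0.1, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(clf(f_train), y_train)  # downstream training loss
        loss.backward()
        opt.step()
    with torch.no_grad():
        logits = clf(f_test)
        test_loss = loss_fn(logits, y_test).item()                     # test loss
        test_acc = (logits.argmax(1) == y_test).float().mean().item() * 100  # %
    return test_loss, test_acc
```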