Figure 1. t-SNE and UMAP visualizations for pre-trained CLIP, ERM-tuned CLIP, and CFR (ours) on Waterbirds.
We observe that both the pre-trained and the ERM-tuned CLIP exhibit noticeable spurious correlations: their features separate along the spurious attribute (the background) rather than the target class. In contrast, the t-SNE and UMAP visualizations of our method show markedly clearer class separation, underscoring its robustness in reducing spurious correlations.