Let Samples Speak: Mitigating Spurious Correlation by Exploiting the Clusterness of Samples


¹University of Electronic Science and Technology of China
²Shihezi University

Background

Deep learning models often learn features that are spuriously correlated with the class label in the training data but are irrelevant to the prediction task. For instance, a model trained to identify waterbirds may focus on water in the background rather than on the bird itself. Class activation maps (CAMs) show that the model's attention is often directed toward these irrelevant features rather than the object of interest. Such models can perform well on the training data yet fail to generalize to unseen data where the spurious correlations do not hold.

Proposed Method

We develop an efficient DNN debiasing pipeline that takes only a few minutes to run and consists of four stages: identifying, neutralizing, eliminating, and updating.

We introduce Neutralizing Spurious Features (NSF), a debiasing method that does not require prior knowledge of bias attributes. NSF consists of four key steps:

(1) Identifying Bias Presence: Minority samples that deviate from the class centroid are identified, as such deviations indicate the presence of spurious features.
(2) Neutralizing Spurious Features to Obtain Bias-Invariant Features: The groups identified in step (1) are used to estimate a bias-invariant representation for each class.
(3) Eliminating Spurious Features: A common transformation is learned across all classes that aligns all training samples within a class to the estimated bias-invariant features. This transformation eliminates spurious features while preserving core features.
(4) Updating Classifier: The classifier is fine-tuned on these bias-invariant features, forcing it to rely on core features alone.
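
The four steps can be sketched on precomputed feature vectors as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the distance-quantile rule for flagging minority samples, the midpoint of the two group means as the bias-invariant target, the ridge-regression linear map, the softmax head, the toy data, and all function names are hypothetical choices made only to show how the stages fit together.

```python
import numpy as np

def identify_minority(features, labels, quantile=0.8):
    """Step 1 (assumed rule): flag samples far from their class centroid."""
    minority = np.zeros(len(labels), dtype=bool)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        dist = np.linalg.norm(features[idx] - features[idx].mean(axis=0), axis=1)
        minority[idx] = dist > np.quantile(dist, quantile)
    return minority

def neutral_targets(features, labels, minority):
    """Step 2 (assumed rule): midpoint of majority and minority group means."""
    targets = {}
    for c in np.unique(labels):
        in_class = labels == c
        majority_mean = features[in_class & ~minority].mean(axis=0)
        minority_mean = features[in_class & minority].mean(axis=0)
        targets[c] = 0.5 * (majority_mean + minority_mean)
    return targets

def fit_shared_transform(features, labels, targets, ridge=1e-3):
    """Step 3 (assumed form): one linear map W shared by all classes,
    fit by ridge regression so that features @ W approximates the targets."""
    Y = np.stack([targets[c] for c in labels])
    A = features.T @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(A, features.T @ Y)

def finetune_classifier(z, labels, n_classes, lr=0.1, steps=200):
    """Step 4: retrain a softmax linear head on the transformed features."""
    W = np.zeros((z.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(steps):
        logits = z @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        grad = (p - onehot) / len(labels)
        W -= lr * z.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def make_toy_data(n_per_class=200, minority_frac=0.1, noise=0.1, seed=0):
    """Toy 2-D features: column 0 is the core cue, column 1 the spurious cue."""
    rng = np.random.default_rng(seed)
    feats, labels, is_min = [], [], []
    for c, sign in enumerate((-1.0, 1.0)):
        n_min = int(n_per_class * minority_frac)
        spur = np.full(n_per_class, sign)
        spur[:n_min] = -sign  # minority samples carry the opposite spurious cue
        x = np.stack([np.full(n_per_class, sign), spur], axis=1)
        feats.append(x + noise * rng.normal(size=x.shape))
        labels.append(np.full(n_per_class, c))
        is_min.append(np.arange(n_per_class) < n_min)
    return np.concatenate(feats), np.concatenate(labels), np.concatenate(is_min)

x, y, true_minority = make_toy_data()
flagged = identify_minority(x, y, quantile=0.9)
targets = neutral_targets(x, y, flagged)
W = fit_shared_transform(x, y, targets)
z = x @ W  # features with the spurious direction suppressed
head_W, head_b = finetune_classifier(z, y, n_classes=2)
pred = (z @ head_W + head_b).argmax(axis=1)
print("minority accuracy:", (pred[true_minority] == y[true_minority]).mean())
```

In this toy setup the spurious coordinate of each class target is close to zero, so the shared map mostly discards that direction and the retrained head has only the core coordinate to rely on. Real features from a deep network would need a more careful identification rule than a fixed distance quantile.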

Results

Experiments on four image and NLP debiasing benchmarks and one medical dataset demonstrate the effectiveness of our approach, which improves worst-group accuracy by more than 20% over standard empirical risk minimization (ERM).
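
For readers unfamiliar with the metric, worst-group accuracy is the lowest accuracy over a set of groups, where each group is commonly a (class label, bias attribute) pair. The sketch below shows the computation on made-up predictions; the function name and the example arrays are illustrative assumptions and do not come from the paper's experiments.

```python
import numpy as np

def worst_group_accuracy(y_true, y_pred, group):
    """Return the lowest per-group accuracy and the per-group breakdown."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    per_group = {
        int(g): float((y_pred[group == g] == y_true[group == g]).mean())
        for g in np.unique(group)
    }
    return min(per_group.values()), per_group

# Hypothetical example: groups 1 and 2 are the under-represented ones.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]
group  = [0, 0, 0, 1, 1, 2, 2, 3, 3, 3]

worst, per_group = worst_group_accuracy(y_true, y_pred, group)
average = float((np.asarray(y_true) == np.asarray(y_pred)).mean())
print(average, worst)  # 0.8 0.5
```

The example shows why the metric matters for spurious correlations: an average accuracy of 0.8 hides two groups on which the model is right only half the time.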

As illustrated in the CAMs, on Waterbirds the ERM model focuses on the background, while the debiased model centers on the bird's body. On CelebA, the ERM model highlights the hair, whereas the debiased model focuses on the face. On CheXpert, the ERM model targets medical devices, while the debiased model concentrates on clinically relevant areas. These visualizations show that the identified biases align with the known biases in those datasets (the background, the hair color, and medical devices) and that debiasing leads the models to rely on more relevant patterns.

BibTeX

@inproceedings{li2024let,
  title={Let Samples Speak: Mitigating Spurious Correlation by Exploiting the Clusterness of Samples},
  author={Weiwei Li and Junzhuo Liu and Yuanyuan Ren and Yuchen Zheng and Yahao Liu and Wen Li},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2025},
}