Removing Spurious Features can Hurt Accuracy and Affect Groups Disproportionately
Abstract
Spurious features interfere with the goal of obtaining robust models that perform well across many groups within the population. A natural remedy is to remove such features from the model. However, in this work, we show that removing spurious features can surprisingly decrease accuracy due to the inductive biases of overparameterized models. In noiseless overparameterized linear regression, we completely characterize how the removal of spurious features affects accuracy across different groups (more generally, test distributions). In addition, we show that removing spurious features can decrease accuracy even on balanced datasets (where each target co-occurs equally with each spurious feature), and that it can inadvertently make the model more susceptible to other spurious features. Finally, we show that robust self-training produces models that no longer depend on spurious features without affecting their overall accuracy. Empirical results on the Toxic-Comment-Detection and CelebA datasets show that our findings extend to non-linear models.
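The linear-regression phenomenon can be sketched with a tiny example. The construction below (feature names, dimensions, and the test group) is our own illustration, not the paper's setup: with one training sample and three features, the minimum-norm interpolator spreads weight across the core, spurious, and nuisance features; removing the spurious column shifts more weight onto the nuisance feature, so error on a group where the nuisance feature flips sign goes up.

```python
# Toy illustration (not the paper's experiments): minimum-norm linear
# interpolation in the overparameterized regime (n=1 sample, d=3 features).
# Feature order: [core, spurious, nuisance]; all names are illustrative.
import numpy as np

# Single training example: core=1, spurious co-occurs (=1), nuisance=1, label=1.
X_train = np.array([[1.0, 1.0, 1.0]])
y_train = np.array([1.0])

# Minimum-norm interpolator over all three features: w = X^+ y.
w_full = np.linalg.pinv(X_train) @ y_train           # [1/3, 1/3, 1/3]

# Remove the spurious feature (keep columns [core, nuisance]) and refit.
X_removed = X_train[:, [0, 2]]
w_removed = np.linalg.pinv(X_removed) @ y_train      # [1/2, 1/2]

# Test group where only the nuisance feature flips: core=1, spurious=1, nuisance=-1.
x_test_full = np.array([1.0, 1.0, -1.0])
x_test_removed = x_test_full[[0, 2]]
y_test = 1.0

err_full = (y_test - w_full @ x_test_full) ** 2            # (1 - 1/3)^2 = 4/9
err_removed = (y_test - w_removed @ x_test_removed) ** 2   # (1 - 0)^2   = 1

# Removing the spurious feature made the model lean harder on the nuisance
# feature, so the error on this group increased from 4/9 to 1.
print(err_full, err_removed)
```

Here removal hurts precisely because of the inductive bias of the minimum-norm solution: with the spurious column gone, the interpolation constraint is satisfied by putting more weight on the remaining (nuisance) direction.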
Related Papers
- Removing Spurious Features can Hurt Accuracy and Affect Groups Disproportionately (2021), 36 citations
- On the Causes of Spurious Solutions in Electromagnetics (2002), 18 citations
- The Spurious Path Problem in Abstraction (2021), 1 citation
- Removing Spurious Features can Hurt Accuracy and Affect Groups Disproportionately (2020), 3 citations
- Receiver spurious responses—Computer improves receiver design (1966)