Feature Noise Induces Loss Discrepancy Across Groups

Part of Proceedings of the International Conference on Machine Learning 1 pre-proceedings (ICML 2020)

Bibtex »Metadata »Paper »Supplemental »


Fereshte Khani, Percy Liang


It has been observed that the performance of standard learning procedures differs widely across groups. Recent studies usually attribute this loss discrepancy to an information deficiency for one group (e.g., one group has less data). In this work, we point to a more subtle source of loss discrepancy---feature noise. Our main result is that even when there is no information deficiency specific to one group (e.g., both groups have infinite data), adding the same amount of feature noise to all individuals leads to loss discrepancy. For linear regression, we characterize this loss discrepancy in terms of the amount of noise and difference between moments of the two groups. We then study the time it takes for an estimator to adapt to a shift in the population that makes the groups have the same mean. We finally validate our results on three real-world datasets.