A Comprehensive Empirical Study of Bias Mitigation Methods for Machine Learning Classifiers
Abstract
Software bias is an increasingly important operational concern for software engineers. We present a large-scale, comprehensive empirical study of 17 representative bias mitigation methods for Machine Learning (ML) classifiers, evaluated with 11 ML performance metrics (e.g., accuracy), 4 fairness metrics, and 20 types of fairness-performance tradeoff assessment, applied to 8 widely adopted software decision tasks. This coverage is the most comprehensive to date, spanning the largest number of bias mitigation methods, evaluation metrics, and fairness-performance tradeoff measures of any prior work on this important software property. We find that (1) the bias mitigation methods significantly decrease ML performance in 53% of the studied scenarios (ranging between 42%∼66% depending on the ML performance metric); (2) the bias mitigation methods significantly improve fairness as measured by the 4 fairness metrics in 46% of all scenarios (ranging between 24%∼59% depending on the fairness metric); (3) the bias mitigation methods even decrease both fairness and ML performance in 25% of the scenarios; (4) the effectiveness of bias mitigation depends on the task, the model, the choice of protected attributes, and the set of metrics used to assess fairness and ML performance; (5) no bias mitigation method achieves the best tradeoff in all scenarios; the best method we find outperforms the others in only 30% of the scenarios. Researchers and practitioners should therefore choose the bias mitigation method best suited to their intended application scenario(s).
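To make the abstract's metric vocabulary concrete, here is a minimal illustrative sketch (not from the paper) of one widely used fairness metric, statistical parity difference, computed alongside one ML performance metric, accuracy, on toy binary predictions before and after a hypothetical mitigation step. The metric choice and the toy data are assumptions for illustration; the study itself evaluates 4 fairness metrics and 11 performance metrics.

```python
def accuracy(y_true, y_pred):
    """Fraction of correct predictions (one of many ML performance metrics)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def statistical_parity_difference(y_pred, protected):
    """P(favorable | unprivileged) - P(favorable | privileged).
    0 means both groups receive the favorable outcome at equal rates;
    negative values mean the privileged group is favored."""
    unpriv = [y for y, a in zip(y_pred, protected) if a == 0]
    priv = [y for y, a in zip(y_pred, protected) if a == 1]
    return sum(unpriv) / len(unpriv) - sum(priv) / len(priv)

# Toy data (hypothetical): 0 = unprivileged group, 1 = privileged group.
y_true    = [1, 0, 1, 1, 0, 1, 0, 0]
protected = [0, 0, 0, 0, 1, 1, 1, 1]
y_before  = [0, 0, 1, 0, 0, 1, 1, 0]  # biased predictions
y_after   = [1, 0, 1, 0, 0, 1, 1, 0]  # predictions after mitigation

for name, y in [("before", y_before), ("after", y_after)]:
    print(name,
          "acc=%.2f" % accuracy(y_true, y),
          "SPD=%.2f" % statistical_parity_difference(y, protected))
```

Reporting both numbers side by side is exactly the kind of fairness-performance tradeoff assessment the study performs: a mitigation method can move either metric in either direction, which is why the paper assesses 20 tradeoff combinations rather than fairness alone.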