Cost-sensitive learning methods for imbalanced data
Abstract
Class imbalance is one of the challenging problems for machine learning algorithms. When learning from highly imbalanced data, most classifiers are overwhelmed by the majority class examples, so the false negative rate tends to be high. Although researchers have introduced many methods to deal with this problem, including resampling techniques and cost-sensitive learning (CSL), most of them focus on only one of these techniques. This study presents two empirical methods that deal with class imbalance using both resampling and CSL. The first method combines several sampling techniques with CSL using support vector machines (SVM) and compares the resulting combinations. The second method applies CSL by optimizing the cost ratio (cost matrix) locally. Our experimental results on 18 imbalanced datasets from the UCI repository show that the first method can reduce the misclassification costs, and the second method can improve the classifier performance.
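To make the idea of tuning a cost ratio concrete, here is a minimal sketch in plain Python. It is not the authors' procedure: it assumes a classifier that already outputs minority-class probabilities, assumes correct predictions cost nothing, and picks the cost ratio that maximizes F1 on a small made-up validation set. The function names, candidate ratios, and data are all hypothetical.

```python
def threshold_from_cost_ratio(ratio):
    """Decision threshold for a cost ratio C_FN / C_FP.

    Assumes zero cost for correct predictions, so expected cost is
    minimised by predicting positive when p >= C_FP / (C_FP + C_FN).
    """
    return 1.0 / (1.0 + ratio)


def f1_score(preds, labels):
    """F1 for the positive (minority) class; returns 0.0 if no true positives."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_cost_ratio(probs, labels, ratios):
    """Grid-search the candidate ratios and return (best_ratio, best_f1)."""
    best_ratio, best_f1 = None, -1.0
    for ratio in ratios:
        threshold = threshold_from_cost_ratio(ratio)
        preds = [1 if p >= threshold else 0 for p in probs]
        score = f1_score(preds, labels)
        if score > best_f1:
            best_ratio, best_f1 = ratio, score
    return best_ratio, best_f1


# Hypothetical validation data: 2 minority (1) vs 6 majority (0) examples.
probs = [0.05, 0.12, 0.15, 0.22, 0.30, 0.35, 0.45, 0.60]
labels = [0, 0, 0, 0, 0, 1, 0, 1]

best_ratio, best_f1 = best_cost_ratio(probs, labels, ratios=[1, 2, 4, 9])
print(best_ratio, round(best_f1, 3))
```

With equal costs (ratio 1) the threshold is 0.5 and one of the two minority examples is missed. Raising the ratio lowers the threshold and recovers it, at the price of more false positives, which is why the ratio is worth searching rather than fixing in advance.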