Constrained Oversampling: An Oversampling Approach to Reduce Noise Generation in Imbalanced Datasets With Class Overlapping
Abstract
Imbalanced datasets are pervasive in classification tasks and degrade classifier performance on minority-class samples. Oversampling is an effective remedy for class imbalance. However, existing oversampling methods generally introduce noisy examples into the original dataset, especially when it contains regions where the classes overlap. In this study, a new oversampling method named Constrained Oversampling is proposed to reduce noise generation in oversampling. The algorithm first extracts the overlapping regions in the dataset. Second, Ant Colony Optimization is applied to define the boundaries of the minority regions. Third, new samples are synthesized under constraints to obtain a balanced dataset. Our proposal distinguishes itself from other techniques by incorporating constraints into the oversampling process to inhibit noise generation. Experiments show that it outperforms various benchmark oversampling approaches. We explain the effectiveness of the method by studying the impact of class overlapping on imbalanced learning.
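The three steps described in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm: the abstract gives no implementation details, so the overlap test here is a simple k-nearest-neighbour check, and the Ant Colony Optimization boundary step is replaced by a much simpler stand-in rule (a synthetic point is kept only if its nearest original point belongs to the minority class). All function names and parameters (`overlap_mask`, `constrained_oversample`, `k`, `max_tries`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)


def overlap_mask(X, y, k=5):
    """Assumed overlap test: flag points with an other-class point among their k nearest neighbours."""
    d = pairwise_dist(X, X)
    np.fill_diagonal(d, np.inf)        # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]  # indices of the k nearest neighbours
    return (y[nn] != y[:, None]).any(axis=1)


def constrained_oversample(X, y, minority=1, k=5, seed=0, max_tries=10_000):
    """Interpolate between minority neighbours, rejecting candidates that break the constraint."""
    rng = np.random.default_rng(seed)
    is_min = y == minority
    X_min = X[is_min]
    if len(X_min) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    n_needed = int((~is_min).sum() - is_min.sum())

    # Step 1 (assumed form): mark minority points that sit in an overlapping region.
    in_overlap = overlap_mask(X, y, k)[is_min]

    # Neighbour table among minority points, used to pick interpolation partners.
    d = pairwise_dist(X_min, X_min)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :min(k, len(X_min) - 1)]

    synthetic, tries = [], 0
    while len(synthetic) < n_needed and tries < max_tries:
        tries += 1
        i = rng.integers(len(X_min))
        j = rng.choice(nn[i])
        cand = X_min[i] + rng.random() * (X_min[j] - X_min[i])
        # Steps 2-3 (stand-in for the ACO-defined boundary): near overlapping
        # regions, keep a candidate only if its nearest original point is minority.
        if in_overlap[i] or in_overlap[j]:
            nearest = np.argmin(np.linalg.norm(X - cand, axis=1))
            if y[nearest] != minority:
                continue
        synthetic.append(cand)

    if not synthetic:
        return X, y
    X_new = np.vstack([X, np.array(synthetic)])
    y_new = np.concatenate([y, np.full(len(synthetic), minority, dtype=y.dtype)])
    return X_new, y_new
```

Calling `constrained_oversample(X, y, minority=1)` on a feature matrix `X` and label vector `y` returns the original data followed by the accepted synthetic minority samples. If the acceptance rule is too strict for a given dataset, the loop stops after `max_tries` attempts and the result may remain imbalanced.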