Resampling or Reweighting: A Comparison of Boosting Implementations
Published: 2008
Abstract
Boosting has been shown to improve the performance of classifiers in many situations, including when data is imbalanced. There are, however, two possible implementations of boosting, and it is unclear which should be used. Boosting by reweighting is typically used, but can only be applied to base learners that are designed to handle example weights. Boosting by resampling, on the other hand, can be applied to any base learner. In this work, we empirically evaluate the differences between these two boosting implementations on imbalanced training data. Using 10 boosting algorithms, 4 learners, and 15 datasets, we find that boosting by resampling performs as well as, or significantly better than, boosting by reweighting (which is often the default boosting implementation). We therefore conclude that, in general, boosting by resampling is preferred over boosting by reweighting.
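The two implementations differ only in how the example weights reach the base learner: reweighting passes the weight vector directly to a weight-aware learner, while resampling draws a bootstrap sample of the training set with probabilities proportional to the weights, so any learner can be used. A minimal sketch of boosting by resampling in the AdaBoost style, using a decision stump as the base learner (all function names here are illustrative, not from the paper):

```python
import numpy as np

def fit_stump(X, y):
    """Fit a one-level decision tree on (unweighted) data with labels in {-1, +1}."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] <= t, sign, -sign)
                err = np.mean(pred != y)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best[1:]  # (feature, threshold, sign)

def stump_predict(stump, X):
    j, t, sign = stump
    return np.where(X[:, j] <= t, sign, -sign)

def adaboost_resample(X, y, rounds=10, rng=None):
    """AdaBoost by resampling: the base learner never sees the weights.

    A reweighting implementation would instead pass w into the learner's
    weighted loss; here w only drives the bootstrap sampling step.
    """
    rng = np.random.default_rng(rng)
    n = len(y)
    w = np.full(n, 1.0 / n)
    models, alphas = [], []
    for _ in range(rounds):
        # Resampling step: draw a bootstrap sample with probability w.
        idx = rng.choice(n, size=n, replace=True, p=w)
        stump = fit_stump(X[idx], y[idx])
        # Weighted error is still measured on the full training set.
        pred = stump_predict(stump, X)
        err = np.clip(np.sum(w * (pred != y)), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        # Standard AdaBoost weight update.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        models.append(stump)
        alphas.append(alpha)
    return models, alphas

def predict(models, alphas, X):
    agg = sum(a * stump_predict(m, X) for m, a in zip(models, alphas))
    return np.sign(agg)
```

The only change needed to turn this sketch into boosting by reweighting is to skip the `rng.choice` step and fit the stump on the full data with `w` as example weights, which is exactly why reweighting requires a weight-aware base learner.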