Self-Training with Selection-by-Rejection
Abstract
Practical machine learning and data mining problems often face a shortage of labeled training data. Self-training algorithms are among the earliest attempts to use unlabeled data to enhance learning. A traditional self-training algorithm labels the unlabeled instances on which a classifier trained on the limited labeled data is most confident. In this paper, a self-training algorithm that shrinks the disagreement region of the hypotheses is presented. The algorithm supplements the training set with self-labeled instances, but only instances that greatly reduce the disagreement region of the hypotheses are labeled and added to the training set. Empirical results demonstrate that the proposed self-training algorithm effectively improves classification performance.
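As a point of reference, the traditional confidence-based loop the abstract contrasts against can be sketched as follows. This is a minimal illustration, not the paper's selection-by-rejection criterion: the nearest-centroid classifier, the margin-based confidence score, and all function names (`fit_centroids`, `predict`, `self_train`) are illustrative assumptions chosen to keep the sketch self-contained.

```python
import math

def fit_centroids(X, y):
    """Train a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        s = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(centroids, x):
    """Return (label, confidence); confidence is the distance margin
    between the two nearest centroids (an illustrative proxy)."""
    dists = sorted((math.dist(x, c), label) for label, c in centroids.items())
    best_d, best_label = dists[0]
    margin = (dists[1][0] - best_d) if len(dists) > 1 else float("inf")
    return best_label, margin

def self_train(X_lab, y_lab, X_unlab, threshold=1.0, max_rounds=5):
    """Traditional self-training: repeatedly self-label the unlabeled
    points on which the current classifier is most confident."""
    X_lab, y_lab, X_unlab = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        centroids = fit_centroids(X_lab, y_lab)
        deferred, added = [], 0
        for x in X_unlab:
            label, conf = predict(centroids, x)
            if conf >= threshold:      # high confidence: add to training set
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:                      # low confidence: try again next round
                deferred.append(x)
        X_unlab = deferred
        if added == 0:                 # nothing passed the threshold; stop
            break
    return fit_centroids(X_lab, y_lab), X_lab, y_lab
```

The proposed algorithm replaces the confidence threshold in the loop above with a different selection criterion: an instance is accepted only if labeling it would greatly reduce the region on which the candidate hypotheses disagree.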
Related Papers
- Email Classification with Co-Training (2011)
- Integrating Co-Training and Recognition for Text Detection (2005)
- Reinforced Co-Training (2018)
- A New Cross-Training Approach by Using Labeled Data (2009)