Model Selection for Type-Supervised Learning with Application to POS Tagging
Citations over time: top 23% of 2015 papers
Abstract
Model selection (picking, for example, the feature set and the regularization strength) is crucial for building high-accuracy NLP models. In supervised learning, we can estimate the accuracy of a model on a held-out subset of the labeled data and choose the model with the highest accuracy. In contrast, here we focus on type-supervised learning, which uses constraints over the possible labels for word types as supervision, and where labeled data is either unavailable or very small. For the setting where no labeled data is available, we perform a comparative study of previously proposed model selection criteria and one novel criterion on type-supervised POS tagging in nine languages. For the setting where a small labeled set is available, we show that the set should be used for semi-supervised learning rather than for model selection only: using it for model selection reduces the error by less than 5%, whereas using it for semi-supervised learning reduces the error by 44%.
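The supervised baseline described above can be sketched in a few lines: score each candidate configuration (feature set, regularization strength) on a held-out labeled subset and keep the best. This is a generic illustration, not the paper's actual models; the candidate grid and the `held_out_accuracy` scorer below are hypothetical stand-ins.

```python
# Sketch of model selection by held-out accuracy, as described in the
# abstract. The scorer is a stand-in: in practice it would train a POS
# tagger with the given configuration and evaluate it on held-out
# labeled data.

def held_out_accuracy(feature_set, reg_strength):
    # Illustrative fixed scores; real scores come from training/evaluation.
    scores = {
        ("word", 0.1): 0.90,
        ("word", 1.0): 0.92,
        ("word+suffix", 0.1): 0.94,
        ("word+suffix", 1.0): 0.93,
    }
    return scores[(feature_set, reg_strength)]

def select_model(candidates):
    # Choose the configuration with the highest held-out accuracy.
    return max(candidates, key=lambda c: held_out_accuracy(*c))

candidates = [(f, c) for f in ("word", "word+suffix") for c in (0.1, 1.0)]
best = select_model(candidates)
print(best)  # configuration with the best held-out accuracy
```

The type-supervised setting studied in the paper is exactly the case where this recipe breaks down: without labeled data, `held_out_accuracy` cannot be computed, which is what motivates the alternative selection criteria compared in the paper.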