
Enriching the knowledge sources used in a maximum entropy part-of-speech tagger

2000, Vol. 13, pp. 63–70


Kristina Toutanova, Christopher D. Manning

Abstract

This paper presents results for a maximum-entropy-based part-of-speech tagger, which achieves superior performance principally by enriching the information sources used for tagging. In particular, we get improved results by incorporating these features: (i) more extensive treatment of capitalization for unknown words; (ii) features for the disambiguation of the tense forms of verbs; (iii) features for disambiguating particles from prepositions and adverbs. The best resulting accuracy for the tagger on the Penn Treebank is 96.86% overall, and 86.91% on previously unseen words.
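
To make feature type (i) concrete, the following is a minimal sketch of how binary capitalization features for a token might be computed before being passed to a maximum entropy model. It is illustrative only: the function name, the feature names, and the choice of four features are assumptions made here, not the feature templates defined in the paper.

```python
def capitalization_features(word, position_in_sentence):
    """Return binary capitalization features for one token.

    Hypothetical feature set for illustration; not the paper's templates.
    """
    starts_upper = word[:1].isupper()
    return {
        # First character is an uppercase letter.
        "initial_cap": starts_upper,
        # Every cased character in the token is uppercase (e.g. "IBM").
        "all_caps": word.isupper(),
        # An uppercase letter appears after the first character (e.g. "McDonald").
        "has_inner_cap": any(ch.isupper() for ch in word[1:]),
        # Capitalized, and not merely because it opens the sentence.
        "cap_not_sentence_initial": starts_upper and position_in_sentence > 0,
    }


features = capitalization_features("McDonald", position_in_sentence=3)
```

In a tagger of this kind, each such feature would typically be paired with a candidate tag, so that the model can learn, for example, that a capitalized unknown word in mid-sentence is likely a proper noun.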

Related Papers

Building an Indonesian rule-based part-of-speech tagger (2014), 42 cited
A Two-Stage Approach to Chinese Part-of-Speech Tagging (2008)
Chinese part of speech tagging based on maximum entropy method (2003), 4 cited
Mongolian Part-of-Speech Tagging with Neural Networks (2021), 2 cited
Part-Of-Speech Tagging in French: State-of-the-Art and Obstacles (2020)