
Building an Indonesian rule-based part-of-speech tagger

2014, pp. 70–73


Rashel Fam, Andry Luthfi, Arawinda Dinakaramani, Hendra Manurung

Abstract

This paper describes a rule-based part-of-speech tagger for the Indonesian language. The system tokenizes documents, taking multi-word expressions into account, and recognizes named entities. It then assigns tags to every token, starting with closed-class words and moving on to open-class words, and disambiguates the tags using a set of manually defined rules. The system currently achieves an accuracy of 79% on a manually tagged corpus of roughly 250,000 tokens.
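The pipeline outlined in the abstract (tokenize with multi-word expressions, look up closed-class then open-class words, disambiguate by rule) can be sketched roughly as follows. This is only an illustration under stated assumptions: the lexicon entries, the tag names (`PRP`, `MD`, `VB`, `IN`, `NN`, `CC`, `Z`), the single disambiguation rule, and all function names are hypothetical and are not taken from the paper. Named-entity recognition is left out.

```python
import re

# Toy resources, invented for illustration; not the paper's lexicon or tagset.
MULTI_WORD = {("rumah", "sakit"): "NN"}            # "rumah sakit" = hospital
CLOSED_CLASS = {"saya": {"PRP"}, "di": {"IN"}, "dan": {"CC"},
                "bisa": {"MD", "NN"}}              # "bisa": "can" or "venom"
OPEN_CLASS = {"bekerja": {"VB"}, "ular": {"NN"}}


def tokenize(text):
    """Split text into tokens, merging known two-word expressions."""
    words = re.findall(r"\w+|[^\w\s]", text.lower())
    tokens, i = [], 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in MULTI_WORD:
            tokens.append(" ".join(pair))
            i += 2
        else:
            tokens.append(words[i])
            i += 1
    return tokens


def candidate_tags(token):
    """Look up multi-word entries, then closed-class, then open-class words."""
    key = tuple(token.split())
    if key in MULTI_WORD:
        return {MULTI_WORD[key]}
    if token in CLOSED_CLASS:
        return set(CLOSED_CLASS[token])
    if token in OPEN_CLASS:
        return set(OPEN_CLASS[token])
    if re.fullmatch(r"[^\w\s]", token):
        return {"Z"}                               # punctuation
    return {"NN"}                                  # fallback guess for unknown words


def disambiguate(candidates):
    """Reduce each candidate set to one tag using a hand-written rule."""
    tags = []
    for i, cands in enumerate(candidates):
        if len(cands) == 1:
            tags.append(next(iter(cands)))
        elif cands == {"MD", "NN"}:
            # Hypothetical rule: modal reading if the next token can be a verb.
            nxt = candidates[i + 1] if i + 1 < len(candidates) else set()
            tags.append("MD" if "VB" in nxt else "NN")
        else:
            tags.append(sorted(cands)[0])          # arbitrary but deterministic
    return tags


def tag(text):
    """Run the full toy pipeline and return (token, tag) pairs."""
    tokens = tokenize(text)
    return list(zip(tokens, disambiguate([candidate_tags(t) for t in tokens])))
```

Calling `tag("Saya bisa bekerja di rumah sakit.")` keeps "rumah sakit" as one token and resolves "bisa" to the modal reading because a verb follows it. A real system would need a far larger lexicon and rule set than this toy example.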
