Hurtlex: A Multilingual Lexicon of Words to Hurt
Abstract
We describe the creation of HurtLex, a multilingual lexicon of hate words. The starting point is the Italian hate lexicon developed by the linguist Tullio De Mauro, organized in 17 categories. The lexicon was expanded by linking it to available synset-based computational lexical resources such as MultiWordNet and BabelNet, and made multilingual through semi-automatic translation and expert annotation. We provide a twofold evaluation of HurtLex as a resource for hate speech detection in social media: a qualitative evaluation against an Italian annotated Twitter corpus of hate against immigrants, and an extrinsic evaluation in the context of the AMI@Ibereval2018 shared task, where the resource was used to extract domain-specific lexicon-based features for the supervised classification of misogyny in English and Spanish tweets.
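To make the idea of lexicon-based features concrete, the following is a minimal sketch of how a category-organized lexicon could be turned into a per-category count vector for a tweet. It is not the authors' implementation: the lexicon entries (`insult_a`, `slur_c`), the category labels (`cat_1`, `cat_2`), and the function name `lexicon_features` are hypothetical placeholders, and the tokenization is a simple assumption.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping a word to a category label.
# These entries and labels are placeholders, not actual HurtLex content.
TOY_LEXICON = {
    "insult_a": "cat_1",
    "insult_b": "cat_1",
    "slur_c": "cat_2",
}

# Fixed category order so every text yields a vector of the same shape.
CATEGORIES = sorted(set(TOY_LEXICON.values()))


def lexicon_features(text, lexicon=TOY_LEXICON, categories=CATEGORIES):
    """Return one count per lexicon category for the given text."""
    # Lowercase and split on word characters (a simplifying assumption).
    tokens = re.findall(r"\w+", text.lower())
    # Count how many tokens fall into each category.
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    return [counts[c] for c in categories]
```

A vector like this could be concatenated with other features (for example, bag-of-words counts) before training a supervised classifier; which classifier and which additional features to use are left open here.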