Five sources of bias in natural language processing
Language and Linguistics Compass, 2021, Vol. 15(8), pp. e12432–e12432
Abstract
Recently, there has been an increased interest in demographically grounded bias in natural language processing (NLP) applications. Much of the recent work has focused on describing bias and providing an overview of bias in a larger context. Here, we provide a simple, actionable summary of this recent work. We outline five sources where bias can occur in NLP systems: (1) the data, (2) the annotation process, (3) the input representations, (4) the models, and finally (5) the research design (or how we conceptualize our research). We explore each of the bias sources in detail in this article, including examples and links to related work, as well as potential counter-measures.