Improving named entity recognition and disambiguation in news headlines
Abstract
We present a novel framework for extracting and disambiguating hyphenated and partially named entities in news headlines. Applying state-of-the-art named entity recognition (NER) and named entity disambiguation (NED) approaches directly to news headlines significantly degrades performance for three reasons: headlines are formatted differently from regular text, they contain hyphenated mentions, and they contain partial entity mentions. The framework assists existing NER and NED systems with these challenges, focusing on the hyphenated and partial entity mentions. It modifies such mentions in a way that increases the probability that they are disambiguated to the correct entity in the knowledge base, and it leverages headlines from the recent past to improve the mentions. Experimental results show that the framework improves the F1-score of mention detection by 12% and 9% for the state-of-the-art Stanford and Illinois NER systems, respectively, and the F1-score of disambiguation by 9%, 12%, 7%, and 5% for the state-of-the-art AIDA, Wikifier, TagMe, and YODIE NED systems, respectively.
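As an illustration only, the idea of rewriting hyphenated and partial mentions before passing them to an NED system could be sketched as follows. This is not the authors' implementation: the function names, the token-matching rule, and the use of a plain list of full mentions collected from recent headlines are all assumptions made for this sketch.

```python
import re


def split_hyphenated(mention):
    """Split a hyphenated mention such as 'Trump-Putin' into its parts."""
    return [part for part in re.split(r"\s*-\s*", mention) if part]


def expand_partial(mention, recent_mentions):
    """Expand a single-token partial mention using recent headlines.

    Assumption: `recent_mentions` holds full entity mentions seen in
    headlines of the recent past. The longest one that contains the
    partial mention as a whole token is returned; if none matches,
    the mention is returned unchanged.
    """
    token = mention.lower()
    candidates = [m for m in recent_mentions if token in m.lower().split()]
    return max(candidates, key=len, default=mention)


def normalize_mention(mention, recent_mentions):
    """Split a hyphenated mention, then expand each part."""
    return [expand_partial(part, recent_mentions)
            for part in split_hyphenated(mention)]


# Hypothetical full mentions gathered from earlier headlines.
recent = ["Donald Trump", "Vladimir Putin", "Angela Merkel"]
expanded = normalize_mention("Trump-Putin", recent)
```

Here `expanded` holds the two expanded mentions, which could then be handed to an existing disambiguation system. A real system would need to distinguish hyphenated entity pairs from names that legitimately contain a hyphen, which this sketch does not attempt.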