Easy Victories and Uphill Battles in Coreference Resolution
Citations Over TimeTop 1% of 2013 papers
Abstract
Classical coreference systems encode various syntactic, discourse, and semantic phenomena explicitly, using heterogenous features computed from hand-crafted heuristics. In contrast, we present a state-of-the-art coreference system that captures such phenomena implicitly, with a small number of homogeneous feature templates examining shallow properties of mentions. Surprisingly, our features are actually more effective than the corresponding hand-engineered ones at modeling these key linguistic phenomena, allowing us to win “easy victories” without crafted heuristics. These features are successful on syntax and discourse; however, they do not model semantic compatibility well, nor do we see gains from experiments with shallow semantic features from the literature, suggesting that this approach to semantics is an “uphill battle.” Nonetheless, our final system 1 outperforms the Stanford system (Lee et al. (2011), the winner of the CoNLL 2011 shared task) by 3.5% absolute on the CoNLL metric and outperforms the IMS system (Bj¨ orkelund and Farkas (2012), the best publicly available English coreference system) by 1.9% absolute.