Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks
Abstract
The task in referring expression comprehension is to localize the object instance in an image that is described by a referring expression phrased in natural language. As a language-to-vision matching task, the key to this problem is to learn a discriminative object feature that can adapt to the expression used. To avoid ambiguity, the expression normally describes not only the properties of the referent itself, but also its relationships to its neighbourhood. To capture and exploit this important information, we propose a graph-based, language-guided attention mechanism. Composed of a node attention component and an edge attention component, the proposed graph attention mechanism explicitly represents inter-object relationships and properties with a flexibility and power impossible with competing approaches. Furthermore, the proposed graph attention mechanism enables the comprehension decision to be visualizable and explainable. Experiments on three referring expression comprehension datasets show the advantage of the proposed approach.
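The node and edge attention idea in the abstract can be illustrated with a toy sketch. This is not the paper's architecture: the feature vectors, the dot-product scoring, and the way the two attention maps are combined are all assumptions made for illustration, and the function name `language_guided_graph_attention` is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def language_guided_graph_attention(query, node_feats, edges):
    """Toy language-guided attention over an object graph.

    query      -- expression embedding (assumed to share a space with features)
    node_feats -- one feature vector per detected object (graph node)
    edges      -- dict mapping a directed pair (i, j) to a relation feature
    """
    # Node attention: how well each object matches the expression.
    node_att = softmax([dot(query, f) for f in node_feats])

    # Edge attention: how well each relationship matches the expression.
    edge_keys = list(edges)
    edge_weights = softmax([dot(query, edges[k]) for k in edge_keys])
    edge_att = dict(zip(edge_keys, edge_weights))

    # Assumed combination rule: an object's score is its own node attention
    # plus its neighbours' node attention weighted by edge attention.
    scores = []
    for i in range(len(node_feats)):
        context = sum(edge_att[(a, b)] * node_att[b]
                      for (a, b) in edge_keys if a == i)
        scores.append(node_att[i] + context)
    return node_att, edge_att, scores


# Hypothetical feature layout: [is_man, is_dog, next_to].
nodes = [[1, 0, 0], [1, 0, 0], [0, 1, 0]]      # man A, man B, dog
relations = {(0, 2): [0, 0, 1],                # man A is next to the dog
             (1, 2): [0, 0, 0]}                # man B has no such relation
expression = [1, 1, 1]                         # "the man next to the dog"

node_att, edge_att, scores = language_guided_graph_attention(
    expression, nodes, relations)
referent = max(range(len(scores)), key=scores.__getitem__)
```

In this toy setup the two men are indistinguishable by node attention alone; only the edge attention on the "next to" relation separates them, which is the neighbourhood effect the abstract describes. The attention weights are plain probabilities, so they can be inspected directly to see which objects and relationships drove the decision.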