Leveraging User Paraphrasing Behavior In Dialog Systems To Automatically Collect Annotations For Long-Tail Utterances
Abstract
In large-scale commercial dialog systems, users express the same request in many alternative ways, with a long tail of less frequent variants. Handling the full range of this distribution is challenging, particularly when relying on manual annotations. However, the same users also provide useful implicit feedback: they often paraphrase an utterance when the dialog system fails to understand it. We propose MARUPA, a method that leverages this feedback by creating annotated training examples from it. MARUPA creates new data fully automatically, without manual intervention or effort from annotators, and specifically for utterances that currently fail. Re-training the dialog system on this new data can improve accuracy and coverage for long-tail utterances. In experiments, we study the effectiveness of this approach in a commercial dialog system across various domains and three languages.
[Figure: example dialog with the labels User, System, and MARUPA. Recoverable text: "the weekend blind uh lights please"; "ToggleOnOffDevice, Device: lights"; "Sure. [lights turning on]".]
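The core idea of turning a paraphrase into an annotation can be illustrated with a minimal sketch. It is not the MARUPA implementation, whose components the abstract does not describe. The `Turn` record, its `success` flag, the token-overlap heuristic, the threshold, and the `PlayMusic` label are all hypothetical stand-ins chosen for illustration: a failed utterance that is immediately followed by a similar, successfully handled utterance inherits that utterance's annotation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Turn:
    # All fields are hypothetical; the abstract does not define a log format.
    utterance: str
    annotation: Optional[str]  # interpretation the system produced, if any
    success: bool              # whether the system handled the turn


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase tokens.

    A crude stand-in for paraphrase detection, assumed here for illustration.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def collect_examples(turns: List[Turn], threshold: float = 0.5) -> List[Tuple[str, str]]:
    """Pair each failed turn with an immediately following successful
    paraphrase and copy that turn's annotation onto the failed utterance."""
    examples = []
    for prev, curr in zip(turns, turns[1:]):
        # Only a failure followed by an annotated success is a candidate.
        if prev.success or not curr.success or curr.annotation is None:
            continue
        if token_overlap(prev.utterance, curr.utterance) >= threshold:
            examples.append((prev.utterance, curr.annotation))
    return examples


# Hypothetical session: the first attempt fails, the rephrasing succeeds.
session = [
    Turn("play uh some jazz music please", None, False),
    Turn("play jazz music", "PlayMusic, Genre: jazz", True),
]
new_training_data = collect_examples(session)
```

In this sketch the failed long-tail utterance ends up paired with the annotation of its successful paraphrase, which is the kind of example that could then be added to the training data. A real system would presumably need a stronger paraphrase detector and a more reliable signal of success or failure than a boolean flag.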