Synthetic QA Corpora Generation with Roundtrip Consistency
2019, pp. 6168–6173
Abstract
We introduce a novel method of generating synthetic question answering corpora by combining models of question generation and answer extraction, and by filtering the results to ensure roundtrip consistency. By pretraining on the resulting corpora we obtain significant improvements on SQuAD2. Our synthetic data generation models, for both question generation and answer extraction, can be fully reproduced by finetuning a publicly available BERT model. We also describe a more powerful variant that does full sequence-to-sequence pretraining for question generation, obtaining exact match and F1 scores that are within 0.1% and 0.4%, respectively, of human performance on SQuAD2.
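The roundtrip consistency filter described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function `roundtrip_filter`, the callables `generate_question` and `answer_question`, and the toy lookup tables are all hypothetical stand-ins for trained models. The exact-match comparison after case and whitespace normalization is also an assumption, since the abstract does not say how consistency is measured.

```python
from typing import Callable, Iterable, List, Tuple

# A synthetic example: (context, question, answer).
QAExample = Tuple[str, str, str]


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace (assumed normalization)."""
    return " ".join(text.lower().split())


def roundtrip_filter(
    context: str,
    candidate_answers: Iterable[str],
    generate_question: Callable[[str, str], str],
    answer_question: Callable[[str, str], str],
) -> List[QAExample]:
    """Keep only (question, answer) pairs that survive the roundtrip.

    For each candidate answer: generate a question from it, answer that
    question again from the context, and keep the pair only when the
    predicted answer matches the original candidate.
    """
    kept: List[QAExample] = []
    for answer in candidate_answers:
        question = generate_question(context, answer)
        predicted = answer_question(context, question)
        if _normalize(predicted) == _normalize(answer):
            kept.append((context, question, answer))
    return kept


# Toy stand-ins for the two models (hypothetical, for illustration only).
def toy_generate_question(context: str, answer: str) -> str:
    table = {
        "Paris": "What is the capital of France?",
        "2 million": "How many people live there?",
    }
    return table[answer]


def toy_answer_question(context: str, question: str) -> str:
    table = {
        "What is the capital of France?": "Paris",
        # Disagrees with the candidate answer, so this pair is dropped.
        "How many people live there?": "about 2 million",
    }
    return table[question]


context = "Paris is the capital of France. About 2 million people live there."
kept = roundtrip_filter(
    context, ["Paris", "2 million"], toy_generate_question, toy_answer_question
)
```

With these toy models, only the "Paris" pair survives, because the answering model returns a different span for the second question. In a real pipeline the two callables would wrap finetuned models, and the surviving examples would form the synthetic pretraining corpus.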