TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data
Abstract
Recent years have witnessed the burgeoning of pretrained language models (LMs) for text-based natural language (NL) understanding tasks. Such models are typically trained on free-form NL text, and hence may not be suitable for tasks like semantic parsing over structured data, which require reasoning over both free-form NL questions and structured tabular data (e.g., database tables). In this paper we present TABERT, a pretrained LM that jointly learns representations for NL sentences and (semi-)structured tables. TABERT is trained on a large corpus of 26 million tables and their English contexts. In experiments, neural semantic parsers using TABERT as feature representation layers achieve new best results on the challenging weakly-supervised semantic parsing benchmark WIKITABLEQUESTIONS, while performing competitively on the text-to-SQL dataset SPIDER.
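To make the "feature representation layer" idea concrete, the sketch below shows, under stated assumptions, how an utterance and one table row might be linearized and jointly encoded with a generic BERT encoder from the Hugging Face `transformers` library. This is an illustrative approximation in the spirit of TABERT's per-row linearization, not the released TABERT implementation; the question, table row, and separator format are hypothetical.

```python
# Illustrative sketch only: jointly encode an NL question and one linearized
# table row with a generic BERT encoder (NOT the actual TaBERT model/code).
from transformers import BertTokenizer, BertModel
import torch

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

question = "In which city did the last 1st place finish occur?"
# One row of a (semi-)structured table, as (column name, cell value) pairs.
row = [("Year", "2007"), ("Venue", "Osaka, Japan"), ("Position", "1st")]

# Linearize the row as "column | value" segments; the separator format here
# is a hypothetical stand-in for TaBERT's row linearization scheme.
table_text = " ; ".join(f"{col} | {val}" for col, val in row)
inputs = tokenizer(question, table_text, return_tensors="pt", truncation=True)

with torch.no_grad():
    outputs = encoder(**inputs)

# Token-level representations spanning both the question and the row;
# a downstream semantic parser would pool these into utterance and
# column vectors before predicting a logical form or SQL query.
joint_encoding = outputs.last_hidden_state
print(joint_encoding.shape)  # (1, sequence_length, 768)
```

A semantic parser built on such an encoder would consume the pooled utterance and column vectors in place of (or in addition to) its own embedding layer, which is the sense in which TABERT serves as a feature representation layer in the reported experiments.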