Document Ranking with a Pretrained Sequence-to-Sequence Model
Abstract
This work proposes the use of a pretrained sequence-to-sequence model for document ranking. Our approach is fundamentally different from a commonly adopted classification-based formulation based on encoder-only pretrained transformer architectures such as BERT. We show how a sequence-to-sequence model can be trained to generate relevance labels as "target tokens", and how the underlying logits of these target tokens can be interpreted as relevance probabilities for ranking. Experimental results on the MS MARCO passage ranking task show that our ranking approach is superior to strong encoder-only models. On three other document retrieval test collections, we demonstrate a zero-shot transfer-based approach that outperforms previous state-of-the-art models requiring in-domain cross-validation. Furthermore, we find that our approach significantly outperforms an encoder-only architecture in a data-poor setting. We investigate this observation in more detail by varying target tokens to probe the model's use of latent knowledge. Surprisingly, we find that the choice of target tokens impacts effectiveness, even for words that are closely related semantically. This finding sheds some light on why our sequence-to-sequence formulation for document ranking is effective. Code and models are available at pygaggle.ai.
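To make the target-token scoring concrete, the following is a minimal sketch using the Hugging Face transformers library. The checkpoint name castorini/monot5-base-msmarco is an assumption here (any T5 model fine-tuned with the "true"/"false" target-token scheme would serve), and the "Query: ... Document: ... Relevant:" input template follows the formulation the paper describes; this is an illustration of the scoring idea, not the released pygaggle implementation.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Hypothetical checkpoint name; any T5 model fine-tuned with the
# "true"/"false" target-token scheme would work in its place.
MODEL_NAME = "castorini/monot5-base-msmarco"

tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)
model.eval()

# Token ids for the two target words; in T5's SentencePiece vocabulary,
# each of "true" and "false" maps to a single token.
TRUE_ID = tokenizer.encode("true", add_special_tokens=False)[0]
FALSE_ID = tokenizer.encode("false", add_special_tokens=False)[0]


def relevance_score(query: str, document: str) -> float:
    """Return P("true" | query, document), used as the ranking score."""
    # The model is prompted to generate "true" (relevant) or "false"
    # (not relevant) as its first output token.
    text = f"Query: {query} Document: {document} Relevant:"
    inputs = tokenizer(text, return_tensors="pt", truncation=True)

    # Run one decoding step: feed the decoder start token and read the
    # logits at the first generated position.
    decoder_input_ids = torch.full(
        (1, 1), model.config.decoder_start_token_id, dtype=torch.long
    )
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=decoder_input_ids).logits[0, 0]

    # A softmax restricted to the two target-token logits yields the
    # relevance probability used for ranking.
    pair = torch.stack([logits[TRUE_ID], logits[FALSE_ID]])
    return torch.softmax(pair, dim=0)[0].item()


# Rank candidate passages for a query by descending relevance probability.
query = "what causes tides"
passages = [
    "Tides are caused by the gravitational pull of the moon and sun.",
    "The stock market rose sharply on Friday.",
]
ranked = sorted(passages, key=lambda p: relevance_score(query, p), reverse=True)
```

The key design point is the final softmax: rather than sampling from the full vocabulary, the probability mass is renormalized over only the two target tokens, turning a single generation step into a scalar relevance score that can be sorted across candidate documents.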