Towards Robust Ranker for Text Retrieval
Abstract
A neural ranker plays an indispensable role in the de facto 'retrieval & rerank' pipeline, but its training still lags behind due to weak negative mining during contrastive learning. Compared to retrievers boosted by self-adversarial (i.e., in-distribution) negative mining, the ranker's heavy structure suffers from query-document combinatorial explosions, so it can only resort to negatives sampled by a fast yet out-of-distribution retriever. These moderate negatives thus compose ineffective contrastive learning samples, becoming the main barrier to learning a robust ranker. To alleviate this, we propose a multi-adversarial training strategy that leverages multiple retrievers as generators to challenge a ranker, where i) diverse hard negatives from a joint distribution are prone to fool the ranker, enabling more effective adversarial learning, and ii) involving extensive out-of-distribution label noises hardens the ranker against each noise distribution, leading to more challenging and robust contrastive learning. To evaluate our robust ranker (dubbed R²ANKER), we conduct experiments in various settings on passage retrieval benchmarks, including BM25 reranking, full ranking, retriever distillation, etc. The empirical results verify the new state-of-the-art effectiveness of our model.
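To make the described recipe more concrete, below is a minimal, hypothetical sketch (PyTorch-style Python) of the core idea in the abstract: pooling hard negatives from several retrievers and training a cross-encoder ranker contrastively so that the positive passage outscores all pooled negatives. The `retriever.search`, `ranker.score`, and `corpus` interfaces are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (not the paper's code): contrastive training of a cross-encoder
# ranker with hard negatives pooled from multiple retrievers.
import torch
import torch.nn.functional as F


def pool_hard_negatives(query, retrievers, positive_id, k_per_retriever=4):
    """Union of top-ranked non-positive passages from several retrievers.

    Each retriever is assumed to expose .search(query, k) -> list of passage ids.
    Mixing retrievers approximates sampling negatives from a joint distribution,
    yielding harder and more diverse negatives than any single retriever alone.
    """
    negatives = []
    for retriever in retrievers:
        for pid in retriever.search(query, k=k_per_retriever):
            if pid != positive_id and pid not in negatives:
                negatives.append(pid)
    return negatives


def contrastive_step(ranker, optimizer, query, positive_id, negative_ids, corpus):
    """One listwise contrastive update: the positive must outscore all pooled negatives."""
    texts = [corpus[positive_id]] + [corpus[pid] for pid in negative_ids]
    # ranker.score is assumed to return a scalar relevance logit per (query, passage) pair.
    logits = torch.stack([ranker.score(query, t) for t in texts])  # shape: (1 + n_neg,)
    target = torch.tensor(0)  # the positive passage sits at index 0
    loss = F.cross_entropy(logits.unsqueeze(0), target.unsqueeze(0))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the adversarial aspect is implicit: as the generators (retrievers) surface negatives the ranker currently scores too highly, the contrastive loss pushes the ranker to separate them from the positive, which is the effect the multi-adversarial strategy aims for.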