Rank1: Test-Time Compute for Reranking in Information Retrieval
February 25, 2025
Authors: Orion Weller, Kathryn Ricci, Eugene Yang, Andrew Yates, Dawn Lawrie, Benjamin Van Durme
cs.AI
Abstract
We introduce Rank1, the first reranking model trained to take advantage of test-time compute. Rank1 demonstrates the applicability within retrieval of using a reasoning language model (e.g., OpenAI's o1, DeepSeek's R1) for distillation in order to rapidly improve the performance of a smaller model. We gather and open-source a dataset of more than 600,000 examples of R1 reasoning traces from queries and passages in MS MARCO. Models trained on this dataset show: (1) state-of-the-art performance on advanced reasoning and instruction-following datasets; (2) remarkably strong out-of-distribution performance due to the ability to respond to user-input prompts; and (3) explainable reasoning chains that can be given to users or to RAG-based systems. Further, we demonstrate that quantized versions of these models retain strong performance while using less compute/memory. Overall, Rank1 shows that test-time compute allows for a fundamentally new type of explainable and performant reranker model for search.