Rank1: Test-Time Compute for Reranking in Information Retrieval
February 25, 2025
Authors: Orion Weller, Kathryn Ricci, Eugene Yang, Andrew Yates, Dawn Lawrie, Benjamin Van Durme
cs.AI
Abstract
We introduce Rank1, the first reranking model trained to take advantage of test-time compute. Rank1 demonstrates the applicability within retrieval of using a reasoning language model (e.g., OpenAI's o1, DeepSeek's R1) for distillation in order to rapidly improve the performance of a smaller model. We gather and open-source a dataset of more than 600,000 examples of R1 reasoning traces from queries and passages in MS MARCO. Models trained on this dataset show: (1) state-of-the-art performance on advanced reasoning and instruction-following datasets; (2) remarkably strong out-of-distribution performance due to the ability to respond to user-input prompts; and (3) explainable reasoning chains that can be given to users or to RAG-based systems. Further, we demonstrate that quantized versions of these models retain strong performance while using less compute/memory. Overall, Rank1 shows that test-time compute allows for a fundamentally new type of explainable and performant reranker model for search.