Search-o1: Agentic Search-Enhanced Large Reasoning Models
January 9, 2025
Authors: Xiaoxi Li, Guanting Dong, Jiajie Jin, Yuyao Zhang, Yujia Zhou, Yutao Zhu, Peitian Zhang, Zhicheng Dou
cs.AI
Abstract
Large reasoning models (LRMs) like OpenAI-o1 have demonstrated impressive
long stepwise reasoning capabilities through large-scale reinforcement
learning. However, their extended reasoning processes often suffer from
knowledge insufficiency, leading to frequent uncertainties and potential
errors. To address this limitation, we introduce Search-o1, a
framework that enhances LRMs with an agentic retrieval-augmented generation
(RAG) mechanism and a Reason-in-Documents module for refining retrieved
documents. Search-o1 integrates an agentic search workflow into the reasoning
process, enabling dynamic retrieval of external knowledge when LRMs encounter
uncertain knowledge points. Additionally, due to the verbose nature of
retrieved documents, we design a separate Reason-in-Documents module to deeply
analyze the retrieved information before injecting it into the reasoning chain,
minimizing noise and preserving coherent reasoning flow. Extensive experiments
on complex reasoning tasks in science, mathematics, and coding, as well as six
open-domain QA benchmarks, demonstrate the strong performance of Search-o1.
This approach enhances the trustworthiness and applicability of LRMs in complex
reasoning tasks, paving the way for more reliable and versatile intelligent
systems. The code is available at
https://github.com/sunnynexus/Search-o1.
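
The workflow described in the abstract can be pictured as a loop in which the model reasons, pauses to request a search when it is unsure, and resumes once a condensed version of the retrieved documents has been added to the reasoning chain. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `<search>` markers, the function names, the `max_searches` budget, and the toy stand-ins for the model, the retriever, and the Reason-in-Documents step are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List

# Hypothetical markers: the abstract does not say how a search request is signalled.
BEGIN_QUERY = "<search>"
END_QUERY = "</search>"


def agentic_reasoning(
    question: str,
    generate: Callable[[str], str],
    search: Callable[[str], List[str]],
    refine: Callable[[str, List[str], str], str],
    max_searches: int = 3,
) -> str:
    """Interleave reasoning with retrieval and return the full reasoning chain."""
    chain = question + "\n"
    searches = 0
    while True:
        # The model continues the chain; it is assumed to stop after END_QUERY
        # whenever it wants external knowledge.
        step = generate(chain)
        chain += step
        start = step.rfind(BEGIN_QUERY)
        end = step.rfind(END_QUERY)
        # Stop when no well-formed query was emitted or the budget is spent.
        if start == -1 or end < start or searches >= max_searches:
            return chain
        query = step[start + len(BEGIN_QUERY):end].strip()
        docs = search(query)
        # Condense the verbose documents before injecting them into the chain.
        chain += "\n[result] " + refine(query, docs, chain) + "\n"
        searches += 1


# Toy stand-ins so the sketch runs without a model or a search engine.
def toy_generate(chain: str) -> str:
    if "[result]" not in chain:
        return "I am unsure. <search>boiling point of water at sea level</search>"
    return "Using the retrieved fact, the answer is 100 C."


def toy_search(query: str) -> List[str]:
    return ["Water boils at 100 C at sea level. Unrelated text about cooking pasta."]


def toy_refine(query: str, docs: List[str], chain: str) -> str:
    # Keep only the first sentence of the first document.
    return docs[0].split(". ")[0] + "."


if __name__ == "__main__":
    print(agentic_reasoning("At what temperature does water boil?",
                            toy_generate, toy_search, toy_refine))
```

Keeping `refine` as a separate callable mirrors the abstract's point that retrieved documents are analyzed apart from the main chain, so only the condensed result is injected and the verbose source text never enters the reasoning context.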