

ReZero: Enhancing LLM search ability by trying one-more-time

April 15, 2025
作者: Alan Dao, Thinh Le
cs.AI

Abstract

Retrieval-Augmented Generation (RAG) improves Large Language Model (LLM) performance on knowledge-intensive tasks but depends heavily on initial search query quality. Current methods, often using Reinforcement Learning (RL), typically focus on query formulation or reasoning over results, without explicitly encouraging persistence after a failed search. We introduce ReZero (Retry-Zero), a novel RL framework that directly rewards the act of retrying a search query following an initial unsuccessful attempt. This incentivizes the LLM to explore alternative queries rather than prematurely halting. ReZero demonstrates significant improvement, achieving 46.88% accuracy compared to a 25% baseline. By rewarding persistence, ReZero enhances LLM robustness in complex information-seeking scenarios where initial queries may prove insufficient.
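The abstract does not give ReZero's exact reward formulation, but the core idea (paying a bonus for issuing a new search query after an unsuccessful one, on top of the usual answer-correctness reward) can be illustrated with a minimal sketch. Everything below is a hypothetical illustration: the function name, the episode encoding, and the reward values are assumptions, not the paper's definitions.

```python
# Hypothetical sketch of a retry-aware reward term, NOT the paper's exact formulation.
# An episode is a list of events: ("search", found: bool) or ("answer", correct: bool).

def retry_aware_reward(episode, retry_bonus=0.2, answer_reward=1.0):
    """Score an episode, paying `retry_bonus` each time the model issues a
    new search query right after a previous search returned nothing, and
    `answer_reward` if the final answer is correct."""
    reward = 0.0
    prev_search_failed = False
    for kind, flag in episode:
        if kind == "search":
            if prev_search_failed:
                reward += retry_bonus       # persistence: retried after a miss
            prev_search_failed = not flag   # flag=False means the search found nothing
        elif kind == "answer":
            if flag:                        # flag=True means the answer was judged correct
                reward += answer_reward
    return reward

# A trajectory that fails its first search, retries successfully, then answers:
episode = [("search", False), ("search", True), ("answer", True)]
```

Under this toy scoring, the retrying trajectory earns the retry bonus plus the answer reward, whereas a trajectory that halts after the failed search and guesses earns at most the answer reward — which is the incentive the abstract describes: explore alternative queries rather than prematurely halting.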

