CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation

Abstract

Retrieval-Augmented Generation (RAG) has become a powerful paradigm for enhancing large language models (LLMs) through external knowledge retrieval. Despite its widespread attention, existing academic research predominantly focuses on single-turn RAG, leaving a significant gap in addressing the complexities of multi-turn conversations found in real-world applications. To bridge this gap, we introduce CORAL, a large-scale benchmark designed to assess RAG systems in realistic multi-turn conversational settings. CORAL includes diverse information-seeking conversations automatically derived from Wikipedia and tackles key challenges such as open-domain coverage, knowledge intensity, free-form responses, and topic shifts. It supports three core tasks of conversational RAG: passage retrieval, response generation, and citation labeling. We propose a unified framework to standardize various conversational RAG methods and conduct a comprehensive evaluation of these methods on CORAL, demonstrating substantial opportunities for improving existing approaches.
