CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation
October 30, 2024
Authors: Yiruo Cheng, Kelong Mao, Ziliang Zhao, Guanting Dong, Hongjin Qian, Yongkang Wu, Tetsuya Sakai, Ji-Rong Wen, Zhicheng Dou
cs.AI
Abstract
Retrieval-Augmented Generation (RAG) has become a powerful paradigm for
enhancing large language models (LLMs) through external knowledge retrieval.
Despite its widespread attention, existing academic research predominantly
focuses on single-turn RAG, leaving a significant gap in addressing the
complexities of multi-turn conversations found in real-world applications. To
bridge this gap, we introduce CORAL, a large-scale benchmark designed to assess
RAG systems in realistic multi-turn conversational settings. CORAL includes
diverse information-seeking conversations automatically derived from Wikipedia
and tackles key challenges such as open-domain coverage, knowledge intensity,
free-form responses, and topic shifts. It supports three core tasks of
conversational RAG: passage retrieval, response generation, and citation
labeling. We propose a unified framework to standardize various conversational
RAG methods and conduct a comprehensive evaluation of these methods on CORAL,
demonstrating substantial opportunities for improving existing approaches.