Multi-task retriever fine-tuning for domain-specific and efficient RAG
January 8, 2025
Authors: Patrice Béchard, Orlando Marquez Ayala
cs.AI
Abstract
Retrieval-Augmented Generation (RAG) has become ubiquitous when deploying
Large Language Models (LLMs), as it can address typical limitations such as
generating hallucinated or outdated information. However, when building
real-world RAG applications, practical issues arise. First, the retrieved
information is generally domain-specific. Since it is computationally expensive
to fine-tune LLMs, it is more feasible to fine-tune the retriever to improve
the quality of the data included in the LLM input. Second, as more applications
are deployed in the same real-world system, one cannot afford to deploy
separate retrievers. Moreover, these RAG applications normally retrieve
different kinds of data. Our solution is to instruction fine-tune a small
retriever encoder on a variety of domain-specific tasks to allow us to deploy
one encoder that can serve many use cases, thereby achieving low cost,
scalability, and speed. We show how this encoder generalizes to out-of-domain
settings as well as to an unseen retrieval task on real-world enterprise use
cases.
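The core pattern the abstract describes — one instruction-tuned encoder serving many retrieval tasks by prepending a task instruction to the query — can be sketched as follows. This is a minimal, self-contained illustration, not the paper's actual model: the bag-of-words `embed` function is a toy stand-in for the fine-tuned retriever encoder, and the instruction string is a hypothetical example.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a stand-in for the fine-tuned encoder,
    # which would map instruction-prefixed text to a dense vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(instruction: str, query: str, corpus: list[str], k: int = 2) -> list[str]:
    # One encoder, many use cases: the task instruction is prepended to the
    # query so the same model can handle different domain-specific tasks.
    q = embed(f"{instruction} {query}")
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

corpus = [
    "reset your password from the account settings page",
    "quarterly revenue grew twelve percent year over year",
    "submit an IT ticket to request new hardware",
]
hits = retrieve(
    "Represent this IT support query for retrieving relevant documents:",
    "how do I reset my password", corpus, k=1,
)
print(hits[0])  # the password-reset document ranks first
```

In a real deployment, only the instruction changes per application, so a single encoder endpoint can back all RAG use cases instead of one retriever per application.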