Multi-task retriever fine-tuning for domain-specific and efficient RAG
January 8, 2025
Authors: Patrice Béchard, Orlando Marquez Ayala
cs.AI
Abstract
Retrieval-Augmented Generation (RAG) has become ubiquitous when deploying
Large Language Models (LLMs), as it can address typical limitations such as
generating hallucinated or outdated information. However, when building
real-world RAG applications, practical issues arise. First, the retrieved
information is generally domain-specific. Since it is computationally expensive
to fine-tune LLMs, it is more feasible to fine-tune the retriever to improve
the quality of the data included in the LLM input. Second, as more applications
are deployed in the same real-world system, one cannot afford to deploy
separate retrievers. Moreover, these RAG applications normally retrieve
different kinds of data. Our solution is to instruction fine-tune a small
retriever encoder on a variety of domain-specific tasks to allow us to deploy
one encoder that can serve many use cases, thereby achieving low cost,
scalability, and speed. We show how this encoder generalizes to out-of-domain
settings as well as to an unseen retrieval task on real-world enterprise use
cases.
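The core pattern the abstract describes — one instruction-tuned encoder serving many retrieval tasks by prepending a task instruction to the query — can be sketched as follows. This is a minimal, self-contained illustration, not the paper's actual model: the bag-of-words `embed` function is a toy stand-in for the fine-tuned retriever encoder, and the instruction string is a hypothetical example.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a stand-in for the fine-tuned encoder,
    # which would map instruction-prefixed text to a dense vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(instruction: str, query: str, corpus: list[str], k: int = 2) -> list[str]:
    # One encoder, many use cases: the task instruction is prepended to the
    # query so the same model can handle different domain-specific tasks.
    q = embed(f"{instruction} {query}")
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

corpus = [
    "reset your password from the account settings page",
    "quarterly revenue grew twelve percent year over year",
    "submit an IT ticket to request new hardware",
]
hits = retrieve(
    "Represent this IT support query for retrieving relevant documents:",
    "how do I reset my password", corpus, k=1,
)
print(hits[0])  # the password-reset document ranks first
```

In a real deployment, only the instruction changes per application, so a single encoder endpoint can back all RAG use cases instead of one retriever per application.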