Multi-task retriever fine-tuning for domain-specific and efficient RAG
January 8, 2025
Authors: Patrice Béchard, Orlando Marquez Ayala
cs.AI
Abstract
Retrieval-Augmented Generation (RAG) has become ubiquitous when deploying
Large Language Models (LLMs), as it can address typical limitations such as
generating hallucinated or outdated information. However, when building
real-world RAG applications, practical issues arise. First, the retrieved
information is generally domain-specific. Since it is computationally expensive
to fine-tune LLMs, it is more feasible to fine-tune the retriever to improve
the quality of the data included in the LLM input. Second, as more applications
are deployed in the same real-world system, one cannot afford to deploy
separate retrievers. Moreover, these RAG applications normally retrieve
different kinds of data. Our solution is to instruction fine-tune a small
retriever encoder on a variety of domain-specific tasks to allow us to deploy
one encoder that can serve many use cases, thereby achieving low cost,
scalability, and speed. We show how this encoder generalizes to out-of-domain
settings as well as to an unseen retrieval task on real-world enterprise use
cases.
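The core idea — one encoder serving many retrieval tasks by prepending a task instruction to each query, trained with in-batch contrastive learning — can be sketched as follows. This is an illustrative toy in NumPy, not the paper's actual implementation: the instruction template, the loss variant (multiple-negatives ranking with in-batch negatives), and all names are assumptions.

```python
import numpy as np

def format_query(instruction: str, query: str) -> str:
    # Hypothetical template: a task instruction prefixed to the query
    # lets a single encoder disambiguate many retrieval use cases.
    return f"{instruction} | query: {query}"

def in_batch_contrastive_loss(q_emb, d_emb, temperature=0.05):
    """Multiple-negatives ranking loss: the document at the same batch
    index is each query's positive; all other in-batch documents act
    as negatives. q_emb, d_emb: (batch, dim), rows L2-normalized."""
    sims = (q_emb @ d_emb.T) / temperature            # (B, B) similarities
    sims -= sims.max(axis=1, keepdims=True)           # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))               # NLL of the positives

# Toy batch standing in for encoder outputs from two different tasks
# that share one model via instruction prefixes.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
q /= np.linalg.norm(q, axis=1, keepdims=True)
d = q + 0.01 * rng.normal(size=(4, 8))               # near-duplicate positives
d /= np.linalg.norm(d, axis=1, keepdims=True)
loss = in_batch_contrastive_loss(q, d)
print(float(loss))
```

In practice the embeddings would come from a small bi-encoder fine-tuned on a mixture of domain-specific (instruction, query, document) triples, so that a single deployed encoder handles every task.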