Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models

Abstract

Retrieval-Augmented Generation (RAG) has been shown to enhance the factual accuracy of Large Language Models (LLMs), but existing methods often have limited reasoning capabilities when it comes to using the retrieved evidence effectively, particularly with open-source LLMs. To close this gap, we introduce a novel framework, Open-RAG, designed to enhance reasoning capabilities in RAG with open-source LLMs. Our framework transforms an arbitrary dense LLM into a parameter-efficient sparse mixture of experts (MoE) model capable of handling complex reasoning tasks, including both single- and multi-hop queries. Open-RAG uniquely trains the model to navigate challenging distractors that appear relevant but are misleading. As a result, Open-RAG leverages latent learning, dynamically selecting relevant experts and integrating external knowledge effectively for more accurate and contextually relevant responses. In addition, we propose a hybrid adaptive retrieval method to determine whether retrieval is necessary and to balance the trade-off between performance gain and inference speed. Experimental results show that the Llama2-7B-based Open-RAG outperforms state-of-the-art LLMs and RAG models such as ChatGPT, Self-RAG, and Command R+ in various knowledge-intensive tasks. We open-source our code and models at https://openragmoe.github.io/.
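
The abstract does not spell out how retrieval necessity is determined, so the following is only an illustrative sketch of one common way to trade accuracy against inference speed: draft an answer without retrieval, score the model's confidence in that draft, and retrieve only when the score falls below a threshold. The function names, the `threshold` parameter, and the `mean`/`min` scoring modes are assumptions made for this example, not the Open-RAG implementation.

```python
import math
from typing import Sequence


def sequence_confidence(token_logprobs: Sequence[float], mode: str = "mean") -> float:
    """Collapse per-token log-probabilities into one confidence score in (0, 1].

    Hypothetical helper: "mean" averages token probabilities,
    "min" takes the least confident token.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    probs = [math.exp(lp) for lp in token_logprobs]
    if mode == "min":
        return min(probs)
    if mode == "mean":
        return sum(probs) / len(probs)
    raise ValueError(f"unknown mode: {mode!r}")


def should_retrieve(
    token_logprobs: Sequence[float],
    threshold: float = 0.5,  # assumed tunable knob, not a value from the paper
    mode: str = "mean",
) -> bool:
    """Return True when the no-retrieval draft looks too uncertain to trust."""
    return sequence_confidence(token_logprobs, mode) < threshold


if __name__ == "__main__":
    confident_draft = [math.log(0.9), math.log(0.95)]
    shaky_draft = [math.log(0.9), math.log(0.2)]
    print(should_retrieve(confident_draft))          # skip retrieval, answer directly
    print(should_retrieve(shaky_draft, mode="min"))  # one weak token triggers retrieval
```

Under these assumptions, raising `threshold` makes the system retrieve more often (slower, with more external evidence), while lowering it favors direct answers and faster inference.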
