NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes
April 15, 2025
Authors: Tianyang Xu, Haojie Zheng, Chengze Li, Haoxiang Chen, Yixin Liu, Ruoxi Chen, Lichao Sun
cs.AI
Abstract
Retrieval-augmented generation (RAG) empowers large language models to access
external and private corpora, enabling factually consistent responses in
specific domains. By exploiting the inherent structure of the corpus,
graph-based RAG methods further enrich this process by building a knowledge
graph index and leveraging the structural nature of graphs. However, current
graph-based RAG approaches seldom prioritize the design of graph structures.
Inadequately designed graphs not only impede the seamless integration of diverse
graph algorithms but also result in workflow inconsistencies and degraded
performance. To further unleash the potential of graphs for RAG, we propose
NodeRAG, a graph-centric framework introducing heterogeneous graph structures
that enable the seamless and holistic integration of graph-based methodologies
into the RAG workflow. By aligning closely with the capabilities of LLMs, this
framework ensures a fully cohesive and efficient end-to-end process. Through
extensive experiments, we demonstrate that NodeRAG exhibits performance
advantages over previous methods, including GraphRAG and LightRAG, not only in
indexing time, query time, and storage efficiency but also in delivering
superior question-answering performance on multi-hop benchmarks and open-ended
head-to-head evaluations with minimal retrieval tokens. Our GitHub repository
is available at https://github.com/Terry-Xu-666/NodeRAG.