
LocAgent: Graph-Guided LLM Agents for Code Localization

March 12, 2025
Authors: Zhaoling Chen, Xiangru Tang, Gangda Deng, Fang Wu, Jialong Wu, Zhiwei Jiang, Viktor Prasanna, Arman Cohan, Xingyao Wang
cs.AI

Abstract

Code localization, the task of identifying precisely where in a codebase changes need to be made, is a fundamental yet challenging part of software maintenance. Existing approaches struggle to navigate complex codebases efficiently when identifying relevant code sections. The challenge lies in bridging natural language problem descriptions with the appropriate code elements, which often requires reasoning across hierarchical structures and multiple dependencies. We introduce LocAgent, a framework that addresses code localization through a graph-based representation. By parsing codebases into directed heterogeneous graphs, LocAgent creates a lightweight representation that captures code structures (files, classes, functions) and their dependencies (imports, invocations, inheritance), enabling LLM agents to effectively search and locate relevant entities through powerful multi-hop reasoning. Experimental results on real-world benchmarks demonstrate that our approach significantly enhances accuracy in code localization. Notably, our method with the fine-tuned Qwen-2.5-Coder-Instruct-32B model achieves results comparable to SOTA proprietary models at greatly reduced cost (approximately 86% reduction), reaching up to 92.7% accuracy on file-level localization while improving downstream GitHub issue resolution success rates by 12% for multiple attempts (Pass@10). Our code is available at https://github.com/gersteinlab/LocAgent.
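To make the graph representation described in the abstract concrete, here is a minimal sketch in Python. It is not LocAgent's implementation: the function names `build_code_graph` and `within_hops`, the `path::Class::method` node IDs, the extra `contains` edge type, and the bare-name resolution heuristic are all assumptions made for illustration. It parses a few in-memory source files with the standard `ast` module into a directed graph with typed nodes (file, class, function) and typed edges (imports, invokes, inherits), then runs a bounded breadth-first search as a simple stand-in for multi-hop exploration.

```python
import ast
from collections import defaultdict, deque
from pathlib import PurePosixPath


def build_code_graph(files):
    """Build a toy heterogeneous code graph.

    `files` maps a file path to Python source text (hypothetical input format).
    Returns (nodes, edges): nodes maps node id -> type, edges maps
    source id -> set of (edge type, target id).
    """
    nodes = {}
    edges = defaultdict(set)
    by_name = defaultdict(set)  # bare name -> node ids (crude resolution heuristic)
    pending = []                # (source id, edge type, referenced name)

    def add_node(node_id, kind, name):
        nodes[node_id] = kind
        by_name[name].add(node_id)

    def visit(tree_node, owner):
        for child in ast.iter_child_nodes(tree_node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "class" if isinstance(child, ast.ClassDef) else "function"
                child_id = f"{owner}::{child.name}"
                add_node(child_id, kind, child.name)
                edges[owner].add(("contains", child_id))
                if kind == "class":
                    for base in child.bases:
                        if isinstance(base, ast.Name):
                            pending.append((child_id, "inherits", base.id))
                visit(child, child_id)
                continue
            if isinstance(child, ast.ImportFrom):
                for alias in child.names:
                    pending.append((owner, "imports", alias.name))
            elif isinstance(child, ast.Import):
                for alias in child.names:
                    pending.append((owner, "imports", alias.name.split(".")[-1]))
            elif isinstance(child, ast.Call):
                func = child.func
                if isinstance(func, ast.Name):
                    pending.append((owner, "invokes", func.id))
                elif isinstance(func, ast.Attribute):
                    pending.append((owner, "invokes", func.attr))
            visit(child, owner)

    for path, source in files.items():
        add_node(path, "file", PurePosixPath(path).stem)
        visit(ast.parse(source), path)

    # Resolve references by bare name only; a real tool would need scoping rules.
    for src, edge_type, name in pending:
        for target in by_name.get(name, ()):
            edges[src].add((edge_type, target))
    return nodes, edges


def within_hops(edges, start, max_hops):
    """Breadth-first search along outgoing edges: node id -> hop distance."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if dist[current] == max_hops:
            continue
        for _edge_type, target in edges.get(current, ()):
            if target not in dist:
                dist[target] = dist[current] + 1
                queue.append(target)
    return dist


FILES = {
    "shapes.py": "class Shape:\n    def area(self):\n        return 0\n",
    "square.py": (
        "from shapes import Shape\n"
        "class Square(Shape):\n"
        "    def area(self):\n"
        "        return side() ** 2\n"
        "def side():\n"
        "    return 2\n"
    ),
}

nodes, edges = build_code_graph(FILES)
reach = within_hops(edges, "square.py::Square", 2)
for node_id, hops in sorted(reach.items()):
    print(hops, nodes[node_id], node_id)
```

Starting from the `Square` class, two hops reach both its parent class (via an inherits edge) and the helper its method calls (via contains, then invokes), which is the kind of cross-file, cross-relation traversal the abstract attributes to graph-guided search.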
