ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning
January 11, 2025
Authors: Xiangru Tang, Tianyu Hu, Muyang Ye, Yanjun Shao, Xunjian Yin, Siru Ouyang, Wangchunshu Zhou, Pan Lu, Zhuosheng Zhang, Yilun Zhao, Arman Cohan, Mark Gerstein
cs.AI
Abstract
Chemical reasoning usually involves complex, multi-step processes that demand
precise calculations, where even minor errors can lead to cascading failures.
Furthermore, large language models (LLMs) encounter difficulties handling
domain-specific formulas, executing reasoning steps accurately, and integrating
code effectively when tackling chemical reasoning tasks. To address these
challenges, we present ChemAgent, a novel framework designed to improve the
performance of LLMs through a dynamic, self-updating library. This library is
developed by decomposing chemical tasks into sub-tasks and compiling these
sub-tasks into a structured collection that can be referenced for future
queries. Then, when presented with a new problem, ChemAgent retrieves and
refines pertinent information from the library, which we call memory,
facilitating effective task decomposition and the generation of solutions. Our
method incorporates three types of memory and a library-enhanced reasoning
component, enabling LLMs to improve over time through experience. Experimental
results on four chemical reasoning datasets from SciBench demonstrate that
ChemAgent achieves performance gains of up to 46% (GPT-4), significantly
outperforming existing methods. Our findings suggest substantial potential for
future applications, including tasks such as drug discovery and materials
science. Our code can be found at https://github.com/gersteinlab/chemagent.
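
The store-then-retrieve loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Library`, `MemoryEntry`, `tokenize`, and `jaccard` are hypothetical, the token-overlap score is a simple stand-in for whatever retrieval ChemAgent actually uses, and the sketch does not model the three memory types or the LLM calls.

```python
from dataclasses import dataclass, field


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split on whitespace into a set of tokens."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Token-overlap similarity in [0, 1]; an assumed stand-in for a real retriever."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class MemoryEntry:
    subtask: str   # description of a previously solved sub-task
    solution: str  # the solution stored alongside it


@dataclass
class Library:
    """Hypothetical library of solved sub-tasks that grows as problems are solved."""
    entries: list[MemoryEntry] = field(default_factory=list)

    def add(self, subtask: str, solution: str) -> None:
        """Store a solved sub-task so later queries can reference it."""
        self.entries.append(MemoryEntry(subtask, solution))

    def retrieve(self, query: str, k: int = 1) -> list[MemoryEntry]:
        """Return the k stored entries whose sub-task text best matches the query."""
        q = tokenize(query)
        ranked = sorted(
            self.entries,
            key=lambda e: jaccard(q, tokenize(e.subtask)),
            reverse=True,
        )
        return ranked[:k]


# Build the library from two previously solved sub-tasks.
library = Library()
library.add("convert mass of a substance to moles", "n = m / M")
library.add("compute pressure of an ideal gas", "P = n * R * T / V")

# A new problem retrieves the most relevant stored memory.
hits = library.retrieve("find the pressure of an ideal gas at 300 K")
print(hits[0].solution)

# After the new problem is solved, its result is written back (the self-update).
library.add(
    "find the pressure of an ideal gas at 300 K",
    "P = n * R * T / V with T = 300 K",
)
```

In this sketch the write-back step is what makes the library self-updating: a later query that resembles the newly solved problem ranks the new entry above the older, more generic one.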