Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training

February 10, 2025
作者: Yuchen Zhuang, Jingfeng Yang, Haoming Jiang, Xin Liu, Kewei Cheng, Sanket Lokegaonkar, Yifan Gao, Qing Ping, Tianyi Liu, Binxuan Huang, Zheng Li, Zhengyang Wang, Pei Chen, Ruijie Wang, Rongzhi Zhang, Nasser Zalmout, Priyanka Nigam, Bing Yin, Chao Zhang
cs.AI

Abstract

Due to the scarcity of agent-oriented pre-training data, LLM-based autonomous agents typically rely on complex prompting or extensive fine-tuning, which often fails to introduce new capabilities while preserving strong generalizability. We introduce Hephaestus-Forge, the first large-scale pre-training corpus designed to enhance the fundamental capabilities of LLM agents in API function calling, intrinsic reasoning and planning, and adapting to environmental feedback. Hephaestus-Forge comprises 103B tokens of agent-specific data encompassing 76,537 APIs, including both tool documentation to introduce knowledge of API functions and function calling trajectories to strengthen intrinsic reasoning. To explore effective training protocols, we investigate scaling laws to identify the optimal recipe in data mixing ratios. By continual pre-training on Hephaestus-Forge, Hephaestus outperforms small- to medium-scale open-source LLMs and rivals commercial LLMs on three agent benchmarks, demonstrating the effectiveness of our pre-training corpus in enhancing fundamental agentic capabilities and generalization of LLMs to new tasks or environments.
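
The abstract mentions two concrete ingredients: function-calling trajectories that interleave reasoning with tool calls and environment feedback, and a data-mixing ratio chosen by studying scaling behavior. The Python sketch below is only illustrative of these ideas; the record fields, API names, mixing ratios, and loss values are invented for the example and are not the paper's actual Hephaestus-Forge schema or measurements.

```python
import json
import numpy as np

# Hypothetical layout of one function-calling trajectory (thought -> action ->
# observation steps ending in an answer); the field names are assumptions.
trajectory = {
    "task": "Find the current weather in Seattle and report it in Celsius.",
    "steps": [
        {
            "thought": "I need the current weather, so I call the weather API.",
            "action": {"api": "get_current_weather", "arguments": {"city": "Seattle"}},
            "observation": {"temperature_f": 68, "condition": "cloudy"},
        },
        {
            "thought": "Convert 68 F to Celsius before answering.",
            "action": {"api": "convert_temperature",
                       "arguments": {"value": 68, "unit": "F", "to": "C"}},
            "observation": {"value": 20.0, "unit": "C"},
        },
    ],
    "final_answer": "It is about 20 C and cloudy in Seattle.",
}

def serialize(traj: dict) -> str:
    """Flatten a trajectory into a plain-text pre-training sample."""
    lines = [f"Task: {traj['task']}"]
    for step in traj["steps"]:
        lines.append(f"Thought: {step['thought']}")
        lines.append(f"Action: {json.dumps(step['action'])}")
        lines.append(f"Observation: {json.dumps(step['observation'])}")
    lines.append(f"Answer: {traj['final_answer']}")
    return "\n".join(lines)

# Toy illustration of picking an agent-data mixing ratio from measured
# validation losses: fit a simple quadratic to loss vs. ratio and take the
# minimizer. The ratios and losses are made-up numbers, not the paper's.
ratios = np.array([0.10, 0.25, 0.50, 0.75])
val_loss = np.array([1.92, 1.78, 1.74, 1.81])
a, b, c = np.polyfit(ratios, val_loss, 2)
best_ratio = -b / (2 * a)

if __name__ == "__main__":
    print(serialize(trajectory))
    print(f"Estimated best agent-data mixing ratio: {best_ratio:.2f}")
```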
