APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
April 4, 2025
Authors: Akshara Prabhakar, Zuxin Liu, Weiran Yao, Jianguo Zhang, Ming Zhu, Shiyu Wang, Zhiwei Liu, Tulika Awalgaonkar, Haolin Chen, Thai Hoang, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Silvio Savarese, Caiming Xiong
cs.AI
Abstract
Training effective AI agents for multi-turn interactions requires
high-quality data that captures realistic human-agent dynamics, yet such data
is scarce and expensive to collect manually. We introduce APIGen-MT, a
two-phase framework that generates verifiable and diverse multi-turn agent
data. In the first phase, our agentic pipeline produces detailed task
blueprints with ground-truth actions, leveraging a committee of LLM reviewers
and iterative feedback loops. These blueprints are then transformed into
complete interaction trajectories through simulated human-agent interplay. We
train a family of models -- the xLAM-2-fc-r series with sizes ranging from 1B
to 70B parameters. Our models outperform frontier models such as GPT-4o and
Claude 3.5 on tau-bench and BFCL benchmarks, with the smaller models
surpassing their larger counterparts, particularly in multi-turn settings,
while maintaining superior consistency across multiple trials. Comprehensive
experiments demonstrate that our verified blueprint-to-details approach yields
high-quality training data, enabling the development of more reliable,
efficient, and capable agents. We open-source both the synthetic data collected
and the trained xLAM-2-fc-r models to advance research in AI agents. Models are
available on HuggingFace at
https://huggingface.co/collections/Salesforce/xlam-2-67ef5be12949d8dcdae354c4
and the project website is https://apigen-mt.github.io.
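
The two-phase flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Blueprint` class, the reviewer and proposer interfaces, the majority-vote approval rule, and the exact-match check against ground-truth actions are all assumptions made for the example, and the LLM calls are replaced by plain Python callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Blueprint:
    """Hypothetical task blueprint: an instruction plus ground-truth actions."""
    instruction: str
    actions: List[str]


# A reviewer returns (approved, feedback). In a real pipeline this would be an
# LLM call; here it is any callable (assumed interface).
Reviewer = Callable[[Blueprint], Tuple[bool, str]]
# A proposer drafts a blueprint, optionally using feedback from the last round.
Proposer = Callable[[List[str]], Blueprint]


def generate_blueprint(propose: Proposer,
                       reviewers: List[Reviewer],
                       max_rounds: int = 3) -> Optional[Blueprint]:
    """Phase 1: draft, review by committee, and retry with feedback."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        candidate = propose(feedback)
        verdicts = [review(candidate) for review in reviewers]
        approvals = sum(ok for ok, _ in verdicts)
        if approvals * 2 > len(reviewers):  # majority vote (assumed rule)
            return candidate
        # Feed the rejecting reviewers' notes into the next draft.
        feedback = [note for ok, note in verdicts if not ok]
    return None


def simulate_trajectory(blueprint: Blueprint,
                        user_turn: Callable,
                        agent_turn: Callable,
                        max_turns: int = 10) -> Optional[list]:
    """Phase 2: roll out a simulated user-agent dialogue from a blueprint.

    The trajectory is kept only if the agent's actions equal the blueprint's
    ground-truth actions (an assumed verification rule).
    """
    history: list = []
    taken: List[str] = []
    for _ in range(max_turns):
        message = user_turn(blueprint, history)
        if message is None:  # simulated user has nothing more to say
            break
        history.append(("user", message))
        reply, actions = agent_turn(history)
        history.append(("agent", reply))
        taken.extend(actions)
    return history if taken == blueprint.actions else None
```

In this sketch, a blueprint that fails review is redrafted with the rejecting reviewers' notes, and a dialogue is discarded whenever the simulated agent's actions diverge from the blueprint, which is one simple way to make the resulting data verifiable.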