APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

April 4, 2025
Authors: Akshara Prabhakar, Zuxin Liu, Weiran Yao, Jianguo Zhang, Ming Zhu, Shiyu Wang, Zhiwei Liu, Tulika Awalgaonkar, Haolin Chen, Thai Hoang, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Silvio Savarese, Caiming Xiong
cs.AI

Abstract

Training effective AI agents for multi-turn interactions requires high-quality data that captures realistic human-agent dynamics, yet such data is scarce and expensive to collect manually. We introduce APIGen-MT, a two-phase framework that generates verifiable and diverse multi-turn agent data. In the first phase, our agentic pipeline produces detailed task blueprints with ground-truth actions, leveraging a committee of LLM reviewers and iterative feedback loops. In the second phase, these blueprints are transformed into complete interaction trajectories through simulated human-agent interplay. We train a family of models, the xLAM-2-fc-r series, with sizes ranging from 1B to 70B parameters. Our models outperform frontier models such as GPT-4o and Claude 3.5 on the tau-bench and BFCL benchmarks, with the smaller models surpassing their larger counterparts, particularly in multi-turn settings, while maintaining superior consistency across multiple trials. Comprehensive experiments demonstrate that our verified blueprint-to-details approach yields high-quality training data, enabling the development of more reliable, efficient, and capable agents. We open-source both the collected synthetic data and the trained xLAM-2-fc-r models to advance research in AI agents. Models are available on HuggingFace at https://huggingface.co/collections/Salesforce/xlam-2-67ef5be12949d8dcdae354c4 and the project website is https://apigen-mt.github.io.
