SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills
April 9, 2025
Authors: Boyuan Zheng, Michael Y. Fatemi, Xiaolong Jin, Zora Zhiruo Wang, Apurva Gandhi, Yueqi Song, Yu Gu, Jayanth Srinivasa, Gaowen Liu, Graham Neubig, Yu Su
cs.AI
Abstract
To survive and thrive in complex environments, humans have evolved
sophisticated self-improvement mechanisms through environment exploration,
hierarchical abstraction of experiences into reusable skills, and
collaborative construction of an ever-growing skill repertoire. Despite recent
advancements, autonomous web agents still lack crucial self-improvement
capabilities, struggling with procedural knowledge abstraction, skill
refinement, and skill composition. In this work, we introduce SkillWeaver, a
skill-centric framework enabling agents to self-improve by autonomously
synthesizing reusable skills as APIs. Given a new website, the agent
autonomously discovers skills, executes them for practice, and distills
practice experiences into robust APIs. Iterative exploration continually
expands a library of lightweight, plug-and-play APIs, significantly enhancing
the agent's capabilities. Experiments on WebArena and real-world websites
demonstrate the efficacy of SkillWeaver, achieving relative success rate
improvements of 31.8% and 39.8%, respectively. Additionally, APIs synthesized
by strong agents substantially enhance weaker agents through transferable
skills, yielding improvements of up to 54.3% on WebArena. These results
demonstrate the effectiveness of honing diverse website interactions into APIs,
which can be seamlessly shared among various web agents.
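
The discover, practice, and distill loop described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Skill`, `practice`, `explore`, and `toy_propose`, the success-rate threshold, and the stubbed skills are all assumptions made for illustration, and a real agent would drive a browser rather than call stub functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Skill:
    """A candidate skill wrapped as a callable API (hypothetical structure)."""
    name: str
    run: Callable[[str], bool]  # takes a website identifier, returns success
    attempts: int = 0
    successes: int = 0


def practice(skill: Skill, website: str, rounds: int) -> float:
    """Execute the skill several times and return its observed success rate."""
    for _ in range(rounds):
        skill.attempts += 1
        try:
            ok = skill.run(website)
        except Exception:
            ok = False  # a crash during practice counts as a failed attempt
        if ok:
            skill.successes += 1
    return skill.successes / skill.attempts


def explore(
    website: str,
    propose: Callable[[str, Set[str]], List[Skill]],
    iterations: int = 2,
    rounds: int = 3,
    threshold: float = 1.0,
) -> Dict[str, Skill]:
    """Iteratively grow a skill library for one website.

    Each iteration proposes skills not yet in the library, practices them,
    and keeps only those whose success rate meets the (assumed) threshold.
    """
    library: Dict[str, Skill] = {}
    for _ in range(iterations):
        for candidate in propose(website, set(library)):
            if practice(candidate, website, rounds) >= threshold:
                library[candidate.name] = candidate
    return library


def toy_propose(website: str, known: Set[str]) -> List[Skill]:
    """Stand-in for skill discovery: one reliable skill, one that always fails."""
    candidates = {
        "search_product": lambda site: True,
        "flaky_checkout": lambda site: False,
    }
    return [Skill(n, f) for n, f in candidates.items() if n not in known]


library = explore("shop.example", toy_propose)
print(sorted(library))
```

In this toy run only the skill that succeeds in every practice round is kept, which mirrors the idea of distilling practice experience into robust APIs. Because the resulting library is a plain mapping from names to callables, it could in principle be handed to a different agent, which is the transfer setting the abstract describes.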