CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation
January 28, 2025
Authors: Faria Huq, Zora Zhiruo Wang, Frank F. Xu, Tianyue Ou, Shuyan Zhou, Jeffrey P. Bigham, Graham Neubig
cs.AI
Abstract
While much work on web agents emphasizes the promise of autonomously
performing tasks on behalf of users, in reality, agents often fall short on
complex tasks in real-world contexts and modeling user preference. This
presents an opportunity for humans to collaborate with the agent and leverage
the agent's capabilities effectively. We propose CowPilot, a framework
supporting autonomous as well as human-agent collaborative web navigation, and
evaluation across task success and task efficiency. CowPilot reduces the number
of steps humans need to perform by allowing agents to propose next steps, while
users are able to pause, reject, or take alternative actions. During execution,
users can interleave their actions with the agent by overriding suggestions or
resuming agent control when needed. We conducted case studies on five common
websites and found that the human-agent collaborative mode achieves the highest
success rate of 95% while requiring humans to perform only 15.2% of the total
steps. Even with human interventions during task execution, the agent
successfully drives up to half of task success on its own. CowPilot can serve
as a useful tool for data collection and agent evaluation across websites,
which we believe will enable research in how users and agents can work
together. Video demonstrations are available at
https://oaishi.github.io/cowpilot.html
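
The suggest-then-execute interaction and the "share of steps performed by the human" figure described in the abstract can be illustrated with a small sketch. This is not CowPilot's actual code or API: the names `StepLog`, `run_episode`, `propose`, `review`, and `human_step_share` are hypothetical, and the scripted agent and reviewer below are toy stand-ins for a real web agent and a real user.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StepLog:
    actor: str   # "agent" or "human" (hypothetical labels)
    action: str  # free-text description of the executed action


def run_episode(
    propose: Callable[[List[StepLog]], Optional[str]],
    review: Callable[[str], Optional[str]],
    max_steps: int = 20,
) -> List[StepLog]:
    """Interleave agent suggestions with human overrides.

    propose(log) returns the agent's next suggested action, or None when done.
    review(suggestion) returns None to accept the suggestion, or a replacement
    action that the human performs instead (a reject-and-override).
    """
    log: List[StepLog] = []
    for _ in range(max_steps):
        suggestion = propose(log)
        if suggestion is None:
            break
        override = review(suggestion)
        if override is None:
            log.append(StepLog("agent", suggestion))
        else:
            log.append(StepLog("human", override))
    return log


def human_step_share(log: List[StepLog]) -> float:
    """Fraction of executed steps that were performed by the human."""
    if not log:
        return 0.0
    return sum(step.actor == "human" for step in log) / len(log)


# Toy scripted agent: follows a fixed plan, one action per step.
PLAN = ["click search box", "type 'shoes'", "click wrong link", "click checkout"]


def scripted_agent(log: List[StepLog]) -> Optional[str]:
    return PLAN[len(log)] if len(log) < len(PLAN) else None


# Toy reviewer: rejects one bad suggestion and takes an alternative action.
def scripted_human(suggestion: str) -> Optional[str]:
    return "click first result" if suggestion == "click wrong link" else None


episode = run_episode(scripted_agent, scripted_human)
print([step.actor for step in episode])
print(human_step_share(episode))
```

In this toy episode the human performs one of four steps. The 15.2% figure reported above is the authors' measured result from their case studies, not something this sketch reproduces.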
AI-Generated Summary