Towards Trustworthy GUI Agents: A Survey

Abstract

GUI agents, powered by large foundation models, can interact with digital interfaces, enabling various applications in web automation, mobile navigation, and software testing. However, their increasing autonomy has raised critical concerns about their security, privacy, and safety. This survey examines the trustworthiness of GUI agents along five critical dimensions: security vulnerabilities, reliability in dynamic environments, transparency and explainability, ethical considerations, and evaluation methodologies. We also identify major challenges, such as vulnerability to adversarial attacks, cascading failure modes in sequential decision-making, and a lack of realistic evaluation benchmarks. These issues not only hinder real-world deployment but also call for comprehensive mitigation strategies that go beyond task success. As GUI agents become more widespread, establishing robust safety standards and responsible development practices is essential. This survey provides a foundation for advancing trustworthy GUI agents through systematic understanding and future research.
