Progent: Programmable Privilege Control for LLM Agents
April 16, 2025
Authors: Tianneng Shi, Jingxuan He, Zhun Wang, Linyu Wu, Hongwei Li, Wenbo Guo, Dawn Song
cs.AI
Abstract
LLM agents are an emerging form of AI systems where large language models
(LLMs) serve as the central component, utilizing a diverse set of tools to
complete user-assigned tasks. Despite their great potential, LLM agents pose
significant security risks. When interacting with the external world, they may
encounter malicious commands from attackers, leading to the execution of
dangerous actions. A promising way to address this is by enforcing the
principle of least privilege: allowing only essential actions for task
completion while blocking unnecessary ones. However, achieving this is
challenging, as it requires covering diverse agent scenarios while preserving
both security and utility.
We introduce Progent, the first privilege control mechanism for LLM agents.
At its core is a domain-specific language for flexibly expressing privilege
control policies applied during agent execution. These policies provide
fine-grained constraints over tool calls, deciding when tool calls are
permissible and specifying fallbacks if they are not. This enables agent
developers and users to craft suitable policies for their specific use cases
and enforce them deterministically to guarantee security. Thanks to its modular
design, integrating Progent does not alter agent internals and requires only
minimal changes to agent implementation, enhancing its practicality and
potential for widespread adoption. To automate policy writing, we leverage LLMs
to generate policies based on user queries, which are then updated dynamically
for improved security and utility. Our extensive evaluation shows that Progent
enables strong security while preserving high utility across three distinct
scenarios or benchmarks: AgentDojo, ASB, and AgentPoison. Furthermore, we
perform an in-depth analysis, showcasing the effectiveness of its core
components and the resilience of its automated policy generation against
adaptive attacks.
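The abstract does not show the policy language itself. As a purely illustrative sketch, and not Progent's actual DSL or API, the Python snippet below captures the kind of deterministic, fine-grained tool-call gating described above: each policy names a tool, constrains its arguments, and specifies a fallback when a call is not permitted, with unmatched tools denied by default in keeping with least privilege. All names here (ToolPolicy, PolicyEngine, enforce) are hypothetical.

```python
# Illustrative sketch only: a minimal, deterministic tool-call gate in the
# spirit of the privilege-control policies described in the abstract.
# Class and function names are hypothetical, NOT Progent's actual DSL or API.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ToolPolicy:
    tool: str                         # tool this policy applies to
    allow: Callable[[dict], bool]     # fine-grained check on the call arguments
    fallback: str = "block"           # action when the call is not permitted,
                                      # e.g. "block" or "ask_user"

@dataclass
class PolicyEngine:
    policies: list[ToolPolicy] = field(default_factory=list)

    def enforce(self, tool: str, args: dict) -> tuple[bool, str]:
        """Deterministically decide whether a tool call may proceed."""
        for p in self.policies:
            if p.tool == tool:
                if p.allow(args):
                    return True, "allowed"
                return False, p.fallback
        # Least privilege: tools with no matching policy are denied by default.
        return False, "block"

# Example: only permit sending email to addresses on a trusted domain,
# and fall back to asking the user otherwise.
engine = PolicyEngine([
    ToolPolicy(
        tool="send_email",
        allow=lambda a: a.get("to", "").endswith("@example.com"),
        fallback="ask_user",
    ),
])

print(engine.enforce("send_email", {"to": "alice@example.com"}))  # (True, 'allowed')
print(engine.enforce("send_email", {"to": "attacker@evil.com"}))  # (False, 'ask_user')
print(engine.enforce("delete_file", {"path": "/tmp/x"}))          # (False, 'block')
```

In this hypothetical setup, the check runs before every tool invocation and its outcome does not depend on the LLM, which mirrors the abstract's point that policies are enforced deterministically; how Progent's real DSL expresses constraints and fallbacks is detailed in the paper itself.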