Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection
December 5, 2024
Authors: Enshen Zhou, Qi Su, Cheng Chi, Zhizheng Zhang, Zhongyuan Wang, Tiejun Huang, Lu Sheng, He Wang
cs.AI
Abstract
Automatic detection and prevention of open-set failures are crucial in
closed-loop robotic systems. Recent studies often struggle to simultaneously
identify unexpected failures reactively after they occur and prevent
foreseeable ones proactively. To this end, we propose Code-as-Monitor (CaM), a
novel paradigm leveraging the vision-language model (VLM) for both open-set
reactive and proactive failure detection. The core of our method is to
formulate both tasks as a unified set of spatio-temporal constraint
satisfaction problems and use VLM-generated code to evaluate them for real-time
monitoring. To enhance the accuracy and efficiency of monitoring, we further
introduce constraint elements that abstract constraint-related entities or
their parts into compact geometric elements. This approach offers greater
generality, simplifies tracking, and facilitates constraint-aware visual
programming by leveraging these elements as visual prompts. Experiments show
that CaM achieves a 28.7% higher success rate and reduces execution time by
31.8% under severe disturbances compared to baselines across three simulators
and a real-world setting. Moreover, CaM can be integrated with open-loop
control policies to form closed-loop systems, enabling long-horizon tasks in
cluttered scenes with dynamic environments.
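
To make the constraint-satisfaction framing concrete, the following is a minimal Python sketch of a monitor that checks a spatial constraint on every frame. It is not the authors' implementation: the names `Point`, `within_distance`, and `monitor` are hypothetical, constraint elements are reduced to tracked 2D points, and the monitor code is hand-written here, whereas in CaM it would be generated by a VLM.

```python
from dataclasses import dataclass
from math import hypot
from typing import Callable, Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Point:
    """A constraint element reduced to one tracked 2D point (an assumption)."""
    x: float
    y: float


# One frame maps an element name to its tracked position at that time step.
Frame = Dict[str, Point]
# A constraint is a predicate over a single frame.
Constraint = Callable[[Frame], bool]


def within_distance(a: str, b: str, max_dist: float) -> Constraint:
    """Build a constraint: elements `a` and `b` stay within `max_dist`."""
    def check(frame: Frame) -> bool:
        p, q = frame[a], frame[b]
        return hypot(p.x - q.x, p.y - q.y) <= max_dist
    return check


def monitor(
    frames: Iterable[Frame],
    constraints: Dict[str, Constraint],
) -> Optional[Tuple[int, str]]:
    """Return (frame index, constraint name) of the first violation, else None."""
    for t, frame in enumerate(frames):
        for name, satisfied in constraints.items():
            if not satisfied(frame):
                return (t, name)
    return None
```

In this sketch, a constraint such as `within_distance("gripper", "cup", 0.05)` would flag the first frame in which the cup drifts more than 0.05 units from the gripper. A proactive variant could, under the same assumptions, use a tighter threshold so that the monitor fires before the failure actually occurs.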