CLEA: Closed-Loop Embodied Agent for Enhancing Task Execution in Dynamic Environments
March 2, 2025
Authors: Mingcong Lei, Ge Wang, Yiming Zhao, Zhixin Mai, Qing Zhao, Yao Guo, Zhen Li, Shuguang Cui, Yatong Han, Jinke Ren
cs.AI
Abstract
Large Language Models (LLMs) exhibit remarkable capabilities in the
hierarchical decomposition of complex tasks through semantic reasoning.
However, their application in embodied systems faces challenges in ensuring
reliable execution of subtask sequences and achieving one-shot success in
long-term task completion. To address these limitations in dynamic
environments, we propose the Closed-Loop Embodied Agent (CLEA), a novel
architecture incorporating four specialized open-source LLMs with functional
decoupling for closed-loop task management. The framework features two core
innovations: (1) an interactive task planner that dynamically generates executable
subtasks based on the environmental memory, and (2) a multimodal execution critic
employing an evaluation framework to conduct a probabilistic assessment of
action feasibility, triggering hierarchical re-planning mechanisms when
environmental perturbations exceed preset thresholds. To validate CLEA's
effectiveness, we conduct experiments in a real environment with manipulable
objects, using two heterogeneous robots for object search, manipulation, and
search-manipulation integration tasks. Across 12 task trials, CLEA outperforms
the baseline model, achieving a 67.3% improvement in success rate and a 52.8%
increase in task completion rate. These results demonstrate that CLEA
significantly enhances the robustness of task planning and execution in dynamic
environments.
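
The closed loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `Memory` class, the threshold value, and the replanning budget are all hypothetical, and the planner and critic (LLMs in the paper) are replaced by plain callables. The sketch also assumes that a feasibility score below the threshold is what triggers re-planning, which is one reading of the abstract.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Memory:
    """Running log of observations (stand-in for the environmental memory)."""
    observations: List[str] = field(default_factory=list)


def run_closed_loop(
    plan: Callable[[str, Memory], List[str]],     # planner: (goal, memory) -> subtasks
    feasibility: Callable[[str, Memory], float],  # critic: (subtask, memory) -> score in [0, 1]
    execute: Callable[[str], str],                # skill: subtask -> observation
    goal: str,
    threshold: float = 0.5,                       # assumed value, not from the paper
    max_replans: int = 3,                         # assumed replanning budget
) -> bool:
    """Return True if every subtask was executed, False if the budget ran out."""
    memory = Memory()
    replans = 0
    subtasks = plan(goal, memory)
    while subtasks:
        subtask = subtasks[0]
        if feasibility(subtask, memory) < threshold:
            # The critic rejects the action: record why and ask for a new plan.
            if replans == max_replans:
                return False
            replans += 1
            memory.observations.append(f"rejected: {subtask}")
            subtasks = plan(goal, memory)
            continue
        # The critic accepts the action: execute it and store the observation.
        memory.observations.append(execute(subtask))
        subtasks = subtasks[1:]
    return True


# Toy stand-ins for the planner, critic, and robot skill.
def toy_plan(goal: str, memory: Memory) -> List[str]:
    # After a rejection, open the drawer before trying to pick the cup.
    if any(o.startswith("rejected") for o in memory.observations):
        return ["open drawer", "pick cup"]
    return ["pick cup"]


def toy_feasibility(subtask: str, memory: Memory) -> float:
    # Picking the cup is only feasible once the drawer has been opened.
    if subtask == "pick cup" and "done: open drawer" not in memory.observations:
        return 0.1
    return 0.9


def toy_execute(subtask: str) -> str:
    return f"done: {subtask}"


if __name__ == "__main__":
    print(run_closed_loop(toy_plan, toy_feasibility, toy_execute, "fetch the cup"))
```

In the toy run, the first plan is rejected by the critic, the rejection is written to memory, and the second plan succeeds. Keeping the planner and critic as separate callables mirrors the functional decoupling the abstract mentions.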