Generative World Explorer

November 18, 2024
Authors: Taiming Lu, Tianmin Shu, Alan Yuille, Daniel Khashabi, Jieneng Chen
cs.AI

Abstract

Planning with partial observation is a central challenge in embodied AI. Most prior work has tackled this challenge by developing agents that physically explore their environment to update their beliefs about the world state. In contrast, humans can imagine unseen parts of the world through mental exploration and revise their beliefs with imagined observations. Such updated beliefs allow them to make more informed decisions without always having to explore the world physically. To achieve this human-like ability, we introduce the Generative World Explorer (Genex), an egocentric world exploration framework that allows an agent to mentally explore a large-scale 3D world (e.g., urban scenes) and acquire imagined observations to update its belief. This updated belief then helps the agent make a more informed decision at the current step. To train Genex, we create a synthetic urban scene dataset, Genex-DB. Our experimental results demonstrate that (1) Genex can generate high-quality and consistent observations during long-horizon exploration of a large virtual physical world and (2) the beliefs updated with the generated observations can inform an existing decision-making model (e.g., an LLM agent) to make better plans.
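
The loop the abstract describes (imagine an observation, revise the belief, then decide) can be illustrated with a toy sketch. This is not Genex itself: the abstract does not specify the belief representation or the update rule, so the discrete Bayesian update, the two hidden states, the likelihood table, and the utility values below are all assumptions made purely for illustration, and the names `update_belief` and `best_action` are hypothetical.

```python
from typing import Dict

Belief = Dict[str, float]


def update_belief(prior: Belief,
                  likelihood: Dict[str, Dict[str, float]],
                  observation: str) -> Belief:
    """Discrete Bayes rule: posterior(s) is proportional to prior(s) * P(observation | s)."""
    unnormalized = {s: p * likelihood[s][observation] for s, p in prior.items()}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the prior")
    return {s: v / total for s, v in unnormalized.items()}


def best_action(belief: Belief, utility: Dict[str, Dict[str, float]]) -> str:
    """Pick the action with the highest expected utility under the belief."""
    def expected(action: str) -> float:
        return sum(belief[s] * utility[action][s] for s in belief)
    return max(utility, key=expected)


# Hypothetical scenario: the agent cannot see what lies around a corner.
prior = {"clear": 0.5, "blocked": 0.5}

# Assumed P(observation | state) for an imagined view around the corner.
likelihood = {
    "clear": {"sees_road": 0.9, "sees_obstacle": 0.1},
    "blocked": {"sees_road": 0.2, "sees_obstacle": 0.8},
}

# Assumed utility of each action in each hidden state.
utility = {
    "proceed": {"clear": 1.0, "blocked": -1.0},
    "stop": {"clear": -0.2, "blocked": 0.0},
}

action_before = best_action(prior, utility)

# The imagined observation stands in for what a generative explorer would produce.
posterior = update_belief(prior, likelihood, "sees_obstacle")
action_after = best_action(posterior, utility)

print(action_before, posterior, action_after)
```

In this toy setup the imagined observation shifts probability mass toward the blocked state, which changes the preferred action without any physical exploration. In the framework described above, the imagined observations are generated views of a 3D world and the decision-maker may be an LLM agent rather than an expected-utility rule.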
