Generative World Explorer
November 18, 2024
Authors: Taiming Lu, Tianmin Shu, Alan Yuille, Daniel Khashabi, Jieneng Chen
cs.AI
Abstract
Planning with partial observation is a central challenge in embodied AI. A
majority of prior works have tackled this challenge by developing agents that
physically explore their environment to update their beliefs about the world
state. In contrast, humans can imagine unseen parts of the world
through mental exploration and revise their beliefs with imagined
observations. Such updated beliefs can allow them to make more informed
decisions, without necessitating the physical exploration of the world at all
times. To achieve this human-like ability, we introduce the Generative
World Explorer (Genex), an egocentric world exploration framework that allows
an agent to mentally explore a large-scale 3D world (e.g., urban scenes) and
acquire imagined observations to update its belief. This updated belief will
then help the agent to make a more informed decision at the current step. To
train Genex, we create a synthetic urban scene dataset, Genex-DB.
Our experimental results demonstrate that (1) Genex can generate
high-quality and consistent observations during long-horizon exploration of a
large virtual physical world and (2) the beliefs updated with the generated
observations can inform an existing decision-making model (e.g., an LLM agent)
to make better plans.