Embodied-RAG: General non-parametric Embodied Memory for Retrieval and Generation
September 26, 2024
Authors: Quanting Xie, So Yeon Min, Tianyi Zhang, Aarav Bajaj, Ruslan Salakhutdinov, Matthew Johnson-Roberson, Yonatan Bisk
cs.AI
Abstract
There is no limit to how much a robot might explore and learn, but all of
that knowledge needs to be searchable and actionable. Within language research,
retrieval-augmented generation (RAG) has become the workhorse of large-scale
non-parametric knowledge; however, existing techniques do not directly transfer
to the embodied domain, where inputs are multimodal, data is highly correlated,
and perception requires abstraction.
To address these challenges, we introduce Embodied-RAG, a framework that
enhances the foundational model of an embodied agent with a non-parametric
memory system capable of autonomously constructing hierarchical knowledge for
both navigation and language generation. Embodied-RAG handles a full range of
spatial and semantic resolutions across diverse environments and query types,
whether for a specific object or a holistic description of ambiance. At its
core, Embodied-RAG's memory is structured as a semantic forest, storing
language descriptions at varying levels of detail. This hierarchical
organization allows the system to efficiently generate context-sensitive
outputs across different robotic platforms. We demonstrate that Embodied-RAG
effectively bridges RAG to the robotics domain, successfully handling over 200
explanation and navigation queries across 19 environments, highlighting its
promise as a general-purpose non-parametric system for embodied agents.
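
To make the idea of a semantic forest more concrete, the following is a minimal sketch of how language descriptions at several levels of detail could be stored and queried. It is not the authors' implementation. All names (`MemoryNode`, `make_parent`, `retrieve`, `STOP_WORDS`) and the example scene are hypothetical, and a toy word-overlap score stands in for whatever language-model-based relevance judgment a real system would use.

```python
from __future__ import annotations
from dataclasses import dataclass, field

# Assumption: a tiny stop-word list so that filler words do not count as matches.
STOP_WORDS = {"a", "the", "with", "and", "to", "in", "on", "of"}


@dataclass
class MemoryNode:
    """One node of a hypothetical semantic forest."""
    description: str                 # language description at this level of detail
    position: tuple[float, float]    # (x, y) of an observation, or centroid of children
    children: list[MemoryNode] = field(default_factory=list)


def make_parent(description: str, children: list[MemoryNode]) -> MemoryNode:
    """Group child nodes under a coarser description placed at their centroid."""
    xs = [c.position[0] for c in children]
    ys = [c.position[1] for c in children]
    centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
    return MemoryNode(description, centroid, list(children))


def tokens(text: str) -> set[str]:
    return set(text.lower().split()) - STOP_WORDS


def score(query: str, node: MemoryNode) -> int:
    """Toy relevance: number of content words shared by query and description."""
    return len(tokens(query) & tokens(node.description))


def retrieve(roots: list[MemoryNode], query: str) -> MemoryNode:
    """Descend greedily from the best root and return the best node on that path.

    Ties favor the deeper (more specific) node, so object queries tend to end
    at leaves while broader queries can stop at an inner node.
    """
    best, best_score = None, -1
    level = roots
    while level:
        node = max(level, key=lambda n: score(query, n))
        s = score(query, node)
        if s >= best_score:
            best, best_score = node, s
        level = node.children
    return best


# Example scene (invented for illustration only).
fridge = MemoryNode("white fridge next to the sink", (2.0, 1.0))
extinguisher = MemoryNode("red fire extinguisher mounted on the wall", (2.5, 0.5))
desk = MemoryNode("standing desk with two monitors", (8.0, 3.0))
bench = MemoryNode("wooden bench under a tree", (20.0, 5.0))

kitchen = make_parent("kitchen area with fridge sink and fire extinguisher",
                      [fridge, extinguisher])
work = make_parent("quiet work area with desks and monitors", [desk])
office = make_parent("office floor with kitchen and desks", [kitchen, work])
yard = make_parent("outdoor courtyard with benches and trees", [bench])
roots = [office, yard]

specific = retrieve(roots, "red fire extinguisher in the kitchen")
holistic = retrieve(roots, "quiet work area with desks")
print(specific.description)  # red fire extinguisher mounted on the wall
print(holistic.description)  # quiet work area with desks and monitors
```

In this sketch, the returned node's `position` could serve as a navigation goal and its `description` as context for text generation. A query about a specific object resolves to a leaf, while a broader query about an area resolves to an inner node, which mirrors the range of spatial and semantic resolutions described above.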