Novel Object 6D Pose Estimation with a Single Reference View
March 7, 2025
Authors: Jian Liu, Wei Sun, Kai Zeng, Jin Zheng, Hui Yang, Lin Wang, Hossein Rahmani, Ajmal Mian
cs.AI
Abstract
Existing novel object 6D pose estimation methods typically rely on CAD models
or dense reference views, which are both difficult to acquire. Using only a
single reference view is more scalable, but challenging due to large pose
discrepancies and limited geometric and spatial information. To address these
issues, we propose a Single-Reference-based novel object 6D pose estimation
method (SinRef-6D). Our key idea is to iteratively establish point-wise
alignment in the camera coordinate system based on state space models (SSMs).
Specifically, iterative camera-space point-wise alignment can effectively
handle large pose discrepancies, while our proposed RGB and Points SSMs can
capture long-range dependencies and spatial information from a single view,
offering linear complexity and superior spatial modeling capability. Once
pre-trained on synthetic data, SinRef-6D can estimate the 6D pose of a novel
object using only a single reference view, without requiring retraining or a
CAD model. Extensive experiments on six popular datasets and real-world robotic
scenes demonstrate that our method achieves performance on par with CAD-based
and dense-reference-view-based methods, despite operating in the more
challenging single-reference setting. Code will be released at
https://github.com/CNJianLiu/SinRef-6D.
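To make "iterative point-wise alignment in the camera coordinate system" concrete, below is a minimal classical sketch. In SinRef-6D the point-wise correspondences come from learned SSM networks; here a nearest-neighbour stand-in plays that role, and each round solves a least-squares rigid transform with the Kabsch algorithm so a large initial pose discrepancy shrinks step by step. All names and the correspondence step are illustrative assumptions, not the paper's actual pipeline.

```python
# Conceptual sketch only: iterative point-wise rigid alignment in the camera
# frame. The paper's learned SSM correspondence predictor is replaced by a
# nearest-neighbour stand-in; function names and structure are assumptions.
import numpy as np

def kabsch(src: np.ndarray, dst: np.ndarray):
    """Least-squares rigid transform (R, t) mapping src onto dst, via SVD."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dst_c - R @ src_c

def iterative_alignment(ref_pts, obs_pts, iters=5):
    """Refine (R, t) over several rounds so a large initial pose gap shrinks."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = ref_pts @ R.T + t
        # Stand-in for a learned point-wise correspondence predictor:
        nn = ((moved[:, None] - obs_pts[None]) ** 2).sum(-1).argmin(axis=1)
        R_d, t_d = kabsch(moved, obs_pts[nn])
        R, t = R_d @ R, R_d @ t + t_d           # compose the pose update
    return R, t
```

The outer loop is what "iterative" refers to: each pass re-estimates correspondences under the current pose and solves a fresh rigid update, which is why large reference-to-query pose differences can be absorbed gradually rather than in one shot.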
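The linear-complexity claim follows from the sequential scan form of state space models. As a generic textbook illustration (the paper's RGB and Points SSM designs are not detailed in the abstract, so all shapes and names here are assumptions), the sketch below computes the standard discretized recurrence h_t = A h_{t-1} + B x_t, y_t = C h_t in a single pass over the sequence.

```python
# Generic discretized state-space recurrence, computed in one pass over the
# sequence: O(L) in sequence length, unlike attention's O(L^2) pairwise cost.
# This is a textbook SSM, not the paper's RGB/Points SSM; all shapes and
# names are assumptions made for illustration.
import numpy as np

def ssm_scan(x: np.ndarray, A: np.ndarray, B: np.ndarray, C: np.ndarray):
    """x: (L, D) token features; A: (N, N); B: (N, D); C: (D, N)."""
    L, _ = x.shape
    h = np.zeros(A.shape[0])
    y = np.empty_like(x, dtype=float)
    for t in range(L):            # single sequential scan over the tokens
        h = A @ h + B @ x[t]      # h_t = A h_{t-1} + B x_t
        y[t] = C @ h              # y_t = C h_t
    return y
```

Because the hidden state h carries a summary of everything seen so far, each output can depend on arbitrarily distant inputs, which is the sense in which such models capture long-range dependencies at linear cost.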