Novel Object 6D Pose Estimation with a Single Reference View

March 7, 2025
Authors: Jian Liu, Wei Sun, Kai Zeng, Jin Zheng, Hui Yang, Lin Wang, Hossein Rahmani, Ajmal Mian
cs.AI

Abstract

Existing novel object 6D pose estimation methods typically rely on CAD models or dense reference views, both of which are difficult to acquire. Using only a single reference view is more scalable, but challenging due to large pose discrepancies and limited geometric and spatial information. To address these issues, we propose a Single-Reference-based novel object 6D pose estimation method (SinRef-6D). Our key idea is to iteratively establish point-wise alignment in the camera coordinate system based on state space models (SSMs). Specifically, iterative camera-space point-wise alignment can effectively handle large pose discrepancies, while our proposed RGB and Points SSMs can capture long-range dependencies and spatial information from a single view, offering linear complexity and superior spatial modeling capability. Once pre-trained on synthetic data, SinRef-6D can estimate the 6D pose of a novel object using only a single reference view, without requiring retraining or a CAD model. Extensive experiments on six popular datasets and real-world robotic scenes demonstrate that our method achieves on-par performance with CAD-based and dense reference view-based methods, despite operating in the more challenging single-reference setting. Code will be released at https://github.com/CNJianLiu/SinRef-6D.
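
To make the idea of iterative camera-space point-wise alignment concrete, here is a minimal sketch of the classical procedure it builds on: given point-wise correspondences between reference and observed points, solve a least-squares rigid transform with the SVD-based Kabsch step, apply it, and repeat. The function names and NumPy implementation are our own illustration, not the paper's code; SinRef-6D learns the alignment with SSM-based networks, where iteration matters because each learned step is only approximate under large pose discrepancies.

```python
# Illustrative sketch (our own, not the paper's method): iterative rigid
# alignment in the camera frame, assuming point-wise correspondences are given.
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) with dst ~= src @ R.T + t; both (N, 3)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def iterative_alignment(ref_pts, obs_pts, n_iters=5):
    """Refine the 6D pose by repeatedly re-solving alignment in camera space."""
    R_acc, t_acc = np.eye(3), np.zeros(3)
    cur = ref_pts.copy()
    for _ in range(n_iters):
        R, t = kabsch(cur, obs_pts)              # one point-wise alignment step
        cur = cur @ R.T + t                      # move points in the camera frame
        R_acc, t_acc = R @ R_acc, R @ t_acc + t  # compose the incremental pose
    return R_acc, t_acc                          # final rotation and translation
```

With exact correspondences a single Kabsch step already suffices; the iteration is what absorbs error when each alignment step is approximate, which is the regime a learned, single-reference estimator operates in.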

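The linear complexity attributed to the RGB and Points SSMs comes from the sequential recurrence at the heart of state space models. A generic discretized SSM scan, in the spirit of Mamba-style layers, can be sketched as follows; the diagonal parameterization, shapes, and function name are illustrative assumptions, not the paper's actual blocks.

```python
# Illustrative sketch of a discretized SSM scan (assumed generic form,
# not the paper's RGB/Points SSM architecture).
import torch

def ssm_scan(x, A, B, C):
    """Recurrence h_t = A * h_{t-1} + B * x_t, y_t = C^T h_t.

    x: (L, D) token sequence; A: (N,) diagonal state decay; B, C: (N, D).
    One pass over the sequence -> O(L) cost, vs. O(L^2) for self-attention.
    """
    N, D = B.shape
    h = torch.zeros(N, D)
    ys = []
    for x_t in x:                        # sequential scan, linear in length L
        h = A[:, None] * h + B * x_t     # update the per-channel hidden state
        ys.append((C * h).sum(dim=0))    # read out one output token of shape (D,)
    return torch.stack(ys)               # (L, D)
```

In this framing, the sequence x could be flattened RGB patch features or sampled point features; each output token aggregates the whole prefix through the hidden state, which is how SSMs capture long-range dependencies at linear cost.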
