Find Any Part in 3D
November 20, 2024
Authors: Ziqi Ma, Yisong Yue, Georgia Gkioxari
cs.AI
Abstract
We study open-world part segmentation in 3D: segmenting any part in any
object based on any text query. Prior methods are limited in object categories
and part vocabularies. Recent advances in AI have demonstrated effective
open-world recognition capabilities in 2D. Inspired by this progress, we
propose an open-world, direct-prediction model for 3D part segmentation that
can be applied zero-shot to any object. Our approach, called Find3D, trains a
general-category point embedding model on large-scale 3D assets from the
internet without any human annotation. It combines a data engine, powered by
foundation models for annotating data, with a contrastive training method. We
achieve strong performance and generalization across multiple datasets, with up
to a 3x improvement in mIoU over the next best method. Our model is 6x to over
300x faster than existing baselines. To encourage research in general-category
open-world 3D part segmentation, we also release a benchmark for general
objects and parts. Project website: https://ziqi-ma.github.io/find3dsite/