BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation
April 3, 2025
Authors: Van Nguyen Nguyen, Stephen Tyree, Andrew Guo, Mederic Fourmy, Anas Gouda, Taeyeop Lee, Sungphill Moon, Hyeontae Son, Lukas Ranftl, Jonathan Tremblay, Eric Brachmann, Bertram Drost, Vincent Lepetit, Carsten Rother, Stan Birchfield, Jiri Matas, Yann Labbe, Martin Sundermeyer, Tomas Hodan
cs.AI
Abstract
We present the evaluation methodology, datasets, and results of the BOP Challenge 2024, the sixth in a series of public competitions organized to capture the state of the art in 6D object pose estimation and related tasks. In 2024, our goal was to transition BOP from lab-like setups to real-world scenarios. First, we introduced new model-free tasks, where no 3D object models are available and methods need to onboard objects just from provided reference videos. Second, we defined a new, more practical 6D object detection task, where the identities of objects visible in a test image are not provided as input. Third, we introduced the new BOP-H3 datasets, recorded with high-resolution sensors and AR/VR headsets and closely resembling real-world scenarios. BOP-H3 includes 3D models and onboarding videos to support both model-based and model-free tasks. Participants competed on seven challenge tracks, each defined by a task, an object onboarding setup, and a dataset group. Notably, the best 2024 method for model-based 6D localization of unseen objects (FreeZeV2.1) achieves 22% higher accuracy on BOP-Classic-Core than the best 2023 method (GenFlow), and is only 4% behind the best 2023 method for seen objects (GPose2023), although it is significantly slower (24.9 s vs. 2.7 s per image). A more practical 2024 method for this task is Co-op, which takes only 0.8 s per image and is 25× faster and 13% more accurate than GenFlow. Methods rank similarly on 6D detection as on 6D localization, but with higher run times. On model-based 2D detection of unseen objects, the best 2024 method (MUSE) achieves a 21% relative improvement over the best 2023 method (CNOS). However, 2D detection accuracy for unseen objects is still noticeably (-53%) behind the accuracy for seen objects (GDet2023). The online evaluation system stays open and is available at http://bop.felk.cvut.cz/.