AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos
November 29, 2024
Authors: Yuze He, Wang Zhao, Shaohui Liu, Yubin Hu, Yushi Bai, Yu-Hui Wen, Yong-Jin Liu
cs.AI
Abstract
We introduce AlphaTablets, a novel and generic representation of 3D planes
that features a continuous 3D surface and precise boundary delineation. By
representing 3D planes as rectangles with alpha channels, AlphaTablets combine
the advantages of current 2D and 3D plane representations, enabling accurate,
consistent and flexible modeling of 3D planes. We derive differentiable
rasterization on top of AlphaTablets to efficiently render 3D planes into
images, and propose a novel bottom-up pipeline for 3D planar reconstruction
from monocular videos. Starting with 2D superpixels and geometric cues from
pre-trained models, we initialize 3D planes as AlphaTablets and optimize them
via differentiable rendering. An effective merging scheme is introduced to
facilitate the growth and refinement of AlphaTablets. Through iterative
optimization and merging, we reconstruct complete and accurate 3D planes with
solid surfaces and clear boundaries. Extensive experiments on the ScanNet
dataset demonstrate state-of-the-art performance in 3D planar reconstruction,
underscoring the great potential of AlphaTablets as a generic 3D plane
representation for various applications. The project page is available at:
https://hyzcluster.github.io/alphatablets
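
To make the "rectangle with an alpha channel" idea concrete, here is a minimal Python sketch of how such a primitive might be stored and queried. It is an illustration under stated assumptions, not the authors' implementation: the class name `Tablet`, its fields, and the method `sample_alpha` are hypothetical, and the lookup is a plain nearest-neighbor ray query rather than the differentiable rasterization the abstract describes.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Tablet:
    """Hypothetical plane primitive: a 3D rectangle carrying an alpha map."""

    center: np.ndarray   # (3,) rectangle center in world coordinates
    normal: np.ndarray   # (3,) unit plane normal
    up: np.ndarray       # (3,) unit in-plane axis, orthogonal to normal
    half_size: tuple     # (half extent along right axis, half extent along up axis)
    alpha: np.ndarray    # (H, W) opacity values in [0, 1]

    def sample_alpha(self, origin: np.ndarray, direction: np.ndarray) -> float:
        """Return the alpha value hit by a ray, or 0.0 if the ray misses."""
        right = np.cross(self.up, self.normal)  # second in-plane axis
        denom = float(np.dot(direction, self.normal))
        if abs(denom) < 1e-8:
            return 0.0  # ray is parallel to the plane
        t = float(np.dot(self.center - origin, self.normal)) / denom
        if t <= 0.0:
            return 0.0  # plane lies behind the ray origin
        offset = origin + t * direction - self.center
        # Normalized rectangle coordinates in [-1, 1] along each in-plane axis.
        u = float(np.dot(offset, right)) / self.half_size[0]
        v = float(np.dot(offset, self.up)) / self.half_size[1]
        if abs(u) > 1.0 or abs(v) > 1.0:
            return 0.0  # hit point is outside the rectangle
        h, w = self.alpha.shape
        col = min(int((u + 1.0) / 2.0 * w), w - 1)
        row = min(int((v + 1.0) / 2.0 * h), h - 1)
        return float(self.alpha[row, col])


# A 2 x 2 tablet facing the camera at depth 2, with a tiny 2 x 2 alpha map.
tablet = Tablet(
    center=np.array([0.0, 0.0, 2.0]),
    normal=np.array([0.0, 0.0, -1.0]),
    up=np.array([0.0, 1.0, 0.0]),
    half_size=(1.0, 1.0),
    alpha=np.array([[0.1, 0.2], [0.3, 0.4]]),
)
camera = np.zeros(3)
center_hit = tablet.sample_alpha(camera, np.array([0.0, 0.0, 1.0]))
miss = tablet.sample_alpha(camera, np.array([2.0, 0.0, 1.0]))
```

The point of the sketch is that the rectangle's pose gives a continuous 3D surface, while the alpha map decides which parts of that rectangle actually belong to the plane, so irregular boundaries do not require an irregular mesh. An optimizable version would presumably replace the nearest-neighbor lookup with interpolated, differentiable sampling.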