AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos

Abstract

We introduce AlphaTablets, a novel and generic representation of 3D planes that features a continuous 3D surface and precise boundary delineation. By representing 3D planes as rectangles with alpha channels, AlphaTablets combine the advantages of current 2D and 3D plane representations, enabling accurate, consistent, and flexible modeling of 3D planes. We derive differentiable rasterization on top of AlphaTablets to efficiently render 3D planes into images, and propose a novel bottom-up pipeline for 3D planar reconstruction from monocular videos. Starting with 2D superpixels and geometric cues from pre-trained models, we initialize 3D planes as AlphaTablets and optimize them via differentiable rendering. We introduce an effective merging scheme to facilitate the growth and refinement of AlphaTablets. Through iterative optimization and merging, we reconstruct complete and accurate 3D planes with solid surfaces and clear boundaries. Extensive experiments on the ScanNet dataset demonstrate state-of-the-art performance in 3D planar reconstruction, underscoring the great potential of AlphaTablets as a generic 3D plane representation for various applications. The project page is available at: https://hyzcluster.github.io/alphatablets
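To make the idea of "rectangles with alpha channels" concrete, here is a minimal sketch of what such a primitive might look like. It is not the authors' implementation: the class name `Tablet`, its fields, and the method `alpha_along_ray` are all hypothetical, and the lookup uses a hard boundary test with nearest-neighbor sampling, so it is not differentiable, unlike the rasterizer the abstract describes. It only illustrates how a rectangle gives a continuous surface while an alpha grid marks which parts of that rectangle belong to the plane.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


@dataclass
class Tablet:
    """Hypothetical rectangle-with-alpha primitive (illustrative only)."""
    center: Vec3
    normal: Vec3              # unit vector
    up: Vec3                  # unit vector, perpendicular to normal
    width: float
    height: float
    alpha: List[List[float]]  # rows x cols grid, values in [0, 1]

    def alpha_along_ray(self, origin: Vec3, direction: Vec3) -> Optional[float]:
        """Return the alpha value where the ray hits the rectangle, else None."""
        denom = dot(direction, self.normal)
        if abs(denom) < 1e-9:
            return None  # ray is parallel to the plane
        t = dot(sub(self.center, origin), self.normal) / denom
        if t <= 0:
            return None  # plane is behind the ray origin
        hit = (origin[0] + t * direction[0],
               origin[1] + t * direction[1],
               origin[2] + t * direction[2])
        # In-plane axes: `right` completes the frame formed by up and normal.
        right = cross(self.up, self.normal)
        offset = sub(hit, self.center)
        u = dot(offset, right) / self.width + 0.5
        v = dot(offset, self.up) / self.height + 0.5
        if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
            return None  # hit point lies outside the rectangle
        rows, cols = len(self.alpha), len(self.alpha[0])
        col = min(int(u * cols), cols - 1)  # nearest-neighbor lookup
        row = min(int(v * rows), rows - 1)
        return self.alpha[row][col]


# A 2 x 2 rectangle facing the camera; one column of the alpha grid is opaque.
tablet = Tablet(center=(0.0, 0.0, 2.0), normal=(0.0, 0.0, -1.0),
                up=(0.0, 1.0, 0.0), width=2.0, height=2.0,
                alpha=[[1.0, 0.0], [1.0, 0.0]])
```

In this sketch the rectangle defines the plane's geometry and the alpha grid carves out its boundary, so a ray can hit the rectangle and still return zero opacity. A differentiable renderer would replace the hard inside/outside test and the nearest-neighbor lookup with smooth interpolation.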
