FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
December 12, 2024
Authors: Jiale Xu, Shenghua Gao, Ying Shan
cs.AI
Abstract
Existing sparse-view reconstruction models heavily rely on accurate known
camera poses. However, deriving camera extrinsics and intrinsics from
sparse-view images presents significant challenges. In this work, we present
FreeSplatter, a highly scalable, feed-forward reconstruction framework capable
of generating high-quality 3D Gaussians from uncalibrated sparse-view images
and recovering their camera parameters in mere seconds. FreeSplatter is built
upon a streamlined transformer architecture, comprising sequential
self-attention blocks that facilitate information exchange among multi-view
image tokens and decode them into pixel-wise 3D Gaussian primitives. The
predicted Gaussian primitives are situated in a unified reference frame,
allowing for high-fidelity 3D modeling and instant camera parameter estimation
using off-the-shelf solvers. To cater to both object-centric and scene-level
reconstruction, we train two model variants of FreeSplatter on extensive
datasets. In both scenarios, FreeSplatter outperforms state-of-the-art
baselines in terms of reconstruction quality and pose estimation accuracy.
Furthermore, we showcase FreeSplatter's potential in enhancing the productivity
of downstream applications, such as text/image-to-3D content creation.
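The abstract notes that because the predicted Gaussians are pixel-aligned 3D points in a unified reference frame, camera parameters can be recovered instantly with off-the-shelf solvers. The paper does not specify the solver here, so as a hedged illustration only (function names and the DLT choice are my own, not FreeSplatter's implementation), the sketch below shows how a full 3×4 projection matrix could be estimated from pixel-aligned 3D points via the classic Direct Linear Transform:

```python
import numpy as np

def estimate_projection_dlt(points3d, pixels):
    """Recover a 3x4 projection matrix P from 2D-3D correspondences
    via the Direct Linear Transform (DLT).

    points3d: (N, 3) 3D points (e.g. per-pixel Gaussian centers).
    pixels:   (N, 2) corresponding pixel coordinates.
    Needs N >= 6 non-degenerate (non-coplanar) correspondences.
    """
    n = points3d.shape[0]
    X = np.hstack([points3d, np.ones((n, 1))])  # homogeneous (N, 4)
    u, v = pixels[:, 0], pixels[:, 1]
    # Each correspondence contributes two linear constraints on vec(P).
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = X
    A[0::2, 8:12] = -u[:, None] * X
    A[1::2, 4:8] = X
    A[1::2, 8:12] = -v[:, None] * X
    # Null-space solution: p minimizes ||A p|| subject to ||p|| = 1.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)

def reproject(P, points3d):
    """Project 3D points with P and dehomogenize to pixel coordinates."""
    n = points3d.shape[0]
    Xh = np.hstack([points3d, np.ones((n, 1))])
    proj = Xh @ P.T
    return proj[:, :2] / proj[:, 2:3]
```

Given a clean projection matrix, intrinsics and extrinsics follow from a standard RQ decomposition of its left 3×3 block; in practice a robust PnP variant with RANSAC would be used on noisy predictions rather than plain DLT.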