FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
December 12, 2024
Authors: Jiale Xu, Shenghua Gao, Ying Shan
cs.AI
Abstract
Existing sparse-view reconstruction models heavily rely on accurate known
camera poses. However, deriving camera extrinsics and intrinsics from
sparse-view images presents significant challenges. In this work, we present
FreeSplatter, a highly scalable, feed-forward reconstruction framework capable
of generating high-quality 3D Gaussians from uncalibrated sparse-view images
and recovering their camera parameters in mere seconds. FreeSplatter is built
upon a streamlined transformer architecture, comprising sequential
self-attention blocks that facilitate information exchange among multi-view
image tokens and decode them into pixel-wise 3D Gaussian primitives. The
predicted Gaussian primitives are situated in a unified reference frame,
allowing for high-fidelity 3D modeling and instant camera parameter estimation
using off-the-shelf solvers. To cater to both object-centric and scene-level
reconstruction, we train two model variants of FreeSplatter on extensive
datasets. In both scenarios, FreeSplatter outperforms state-of-the-art
baselines in terms of reconstruction quality and pose estimation accuracy.
Furthermore, we showcase FreeSplatter's potential in enhancing the productivity
of downstream applications, such as text/image-to-3D content creation.
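
The abstract describes a streamlined transformer that exchanges information among multi-view image tokens via sequential self-attention blocks and decodes them into pixel-wise 3D Gaussian primitives in a unified reference frame. The PyTorch sketch below illustrates that flow only; all module choices, dimensions, and the 14-channel Gaussian parameterization (position, scale, rotation quaternion, opacity, color) are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class FreeSplatterSketch(nn.Module):
    """Minimal sketch of the pipeline described in the abstract:
    multi-view image tokens -> sequential self-attention blocks ->
    per-pixel 3D Gaussian primitives. Sizes and layers are assumptions."""

    def __init__(self, patch=8, dim=768, depth=12, heads=12):
        super().__init__()
        self.patch = patch
        # Patchify RGB images into tokens (hypothetical patch embedding).
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        # Sequential self-attention blocks over the concatenated view tokens.
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4,
                                           batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, depth)
        # Decode each token into per-pixel Gaussians: xyz (3) + scale (3)
        # + rotation quaternion (4) + opacity (1) + RGB (3) = 14 channels.
        self.head = nn.Linear(dim, patch * patch * 14)

    def forward(self, views):
        # views: (B, V, 3, H, W) uncalibrated input images.
        B, V, C, H, W = views.shape
        tok = self.embed(views.flatten(0, 1))        # (B*V, dim, H/p, W/p)
        tok = tok.flatten(2).transpose(1, 2)         # (B*V, N, dim)
        tok = tok.reshape(B, V * tok.shape[1], -1)   # concatenate all views
        tok = self.blocks(tok)                       # cross-view information exchange
        gauss = self.head(tok)                       # (B, V*N, p*p*14)
        # One Gaussian per input pixel, expressed in a single reference frame.
        return gauss.reshape(B, V * H * W, 14)
```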
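The abstract also states that camera parameters can be recovered instantly from the predicted Gaussians with off-the-shelf solvers. One natural instantiation is PnP: each input pixel has a predicted 3D Gaussian center, so 2D-3D correspondences are available for free. The sketch below uses OpenCV's PnP-RANSAC solver; the exact solver and the idea of scoring an assumed focal length by inlier count are assumptions for illustration, not the paper's stated procedure.

```python
import numpy as np
import cv2

def estimate_pose_pnp(points3d, H, W, focal):
    """Recover one view's camera pose from its per-pixel predicted 3D
    Gaussian centers using an off-the-shelf PnP-RANSAC solver.

    points3d: (H*W, 3) Gaussian centers in the unified reference frame.
    focal:    assumed focal length in pixels (could be searched over).
    """
    # 2D pixel coordinates paired with each predicted 3D point.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    points2d = np.stack([u, v], axis=-1).reshape(-1, 2).astype(np.float64)

    # Pinhole intrinsics with the principal point at the image center (assumed).
    K = np.array([[focal, 0.0, W / 2.0],
                  [0.0, focal, H / 2.0],
                  [0.0, 0.0, 1.0]], dtype=np.float64)

    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        points3d.astype(np.float64), points2d, K, None,
        reprojectionError=4.0, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)       # world-to-camera rotation
    return R, tvec, len(inliers)     # inlier count can score a focal-length search
```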