Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
January 23, 2025
Authors: Jianing Yang, Alexander Sax, Kevin J. Liang, Mikael Henaff, Hao Tang, Ang Cao, Joyce Chai, Franziska Meier, Matt Feiszli
cs.AI
Abstract
Multi-view 3D reconstruction remains a core challenge in computer vision,
particularly in applications requiring accurate and scalable representations
across diverse perspectives. Current leading methods such as DUSt3R employ a
fundamentally pairwise approach, processing images in pairs and necessitating
costly global alignment procedures to reconstruct from multiple views. In this
work, we propose Fast 3D Reconstruction (Fast3R), a novel multi-view
generalization to DUSt3R that achieves efficient and scalable 3D reconstruction
by processing many views in parallel. Fast3R's Transformer-based architecture
forwards N images in a single forward pass, bypassing the need for iterative
alignment. Through extensive experiments on camera pose estimation and 3D
reconstruction, Fast3R demonstrates state-of-the-art performance, with
significant improvements in inference speed and reduced error accumulation.
These results establish Fast3R as a robust alternative for multi-view
applications, offering enhanced scalability without compromising reconstruction
accuracy.
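To make the abstract's scaling argument concrete, here is a minimal sketch that counts network forward calls only. It assumes a pairwise pipeline evaluates every unordered image pair exactly once, which is a simplification for illustration: real pairwise systems may subsample pairs or process ordered pairs, and the count ignores both the global alignment step and the compute inside each call. The function name `pairwise_call_count` is hypothetical and does not come from the paper or any library.

```python
from math import comb


def pairwise_call_count(num_views: int) -> int:
    """Number of unordered image pairs among num_views images.

    Assumption for illustration: a pairwise method runs one forward
    call per unordered pair, with no pair subsampling.
    """
    return comb(num_views, 2)


# A single-pass multi-view model needs one forward call for any N.
SINGLE_PASS_CALLS = 1

for n in (2, 10, 100, 1000):
    print(f"N={n}: pairwise calls={pairwise_call_count(n)}, "
          f"single-pass calls={SINGLE_PASS_CALLS}")
```

Under this assumption the pairwise call count grows quadratically with the number of views, while the single-pass design keeps the call count constant. The cost of that one call still grows with N, so this shows the difference in how often the network runs, not a full compute comparison.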