Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models
March 3, 2025
Authors: Jay Zhangjie Wu, Yuxuan Zhang, Haithem Turki, Xuanchi Ren, Jun Gao, Mike Zheng Shou, Sanja Fidler, Zan Gojcic, Huan Ling
cs.AI
Abstract
Neural Radiance Fields and 3D Gaussian Splatting have revolutionized 3D
reconstruction and novel-view synthesis tasks. However, achieving photorealistic
rendering from extreme novel viewpoints remains challenging, as artifacts
persist across representations. In this work, we introduce Difix3D+, a novel
pipeline designed to enhance 3D reconstruction and novel-view synthesis through
single-step diffusion models. At the core of our approach is Difix, a
single-step image diffusion model trained to enhance and remove artifacts in
rendered novel views caused by underconstrained regions of the 3D
representation. Difix serves two critical roles in our pipeline. First, it is
used during the reconstruction phase to clean up pseudo-training views that are
rendered from the reconstruction and then distilled back into 3D. This greatly
enhances underconstrained regions and improves the overall 3D representation
quality. More importantly, Difix also acts as a neural enhancer during
inference, effectively removing residual artifacts arising from imperfect 3D
supervision and the limited capacity of current reconstruction models. Difix3D+
is a general solution, a single model compatible with both NeRF and 3DGS
representations, and it achieves an average 2× improvement in FID score
over baselines while maintaining 3D consistency.
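The two roles of Difix described above can be summarized as a small control loop: during reconstruction, pseudo-views rendered from the 3D representation are cleaned and distilled back; at inference, the same model enhances the final render. The sketch below illustrates that loop only; every name (`difix`, `render`, `distill`, `reconstruct`, `infer`) is a hypothetical placeholder, not the authors' API, and the "model" is a toy stand-in for a NeRF/3DGS representation.

```python
# Hypothetical sketch of the Difix3D+ loop, assuming toy stand-ins for
# the real components (diffusion model, renderer, 3D representation).

def difix(image):
    """Single-step artifact fixer (stub): clamp out-of-range 'artifact' values."""
    return [min(max(p, 0.0), 1.0) for p in image]

def render(model, pose):
    """Render a view from the toy 3D representation (stub)."""
    # Unseen poses return an artifact-laden default rendering.
    return model.get(pose, [1.3, -0.2, 0.5])

def distill(model, pose, image):
    """Distill a cleaned pseudo-view back into the 3D representation (stub)."""
    model[pose] = image

def reconstruct(model, novel_poses, rounds=1):
    """Reconstruction phase: clean rendered pseudo-views, distill them back."""
    for _ in range(rounds):
        for pose in novel_poses:
            distill(model, pose, difix(render(model, pose)))
    return model

def infer(model, pose):
    """Inference phase: Difix acts as a neural enhancer on the final render."""
    return difix(render(model, pose))
```

The point of the structure is that the same single-step model is reused in both phases, which is what lets one network serve NeRF and 3DGS backends alike.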