NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images

December 4, 2024
Authors: Lingen Li, Zhaoyang Zhang, Yaowei Li, Jiale Xu, Xiaoyu Li, Wenbo Hu, Weihao Cheng, Jinwei Gu, Tianfan Xue, Ying Shan
cs.AI

Abstract

Recent advancements in generative models have significantly improved novel view synthesis (NVS) from multi-view data. However, existing methods depend on external multi-view alignment processes, such as explicit pose estimation or pre-reconstruction, which limits their flexibility and accessibility, especially when alignment is unstable due to insufficient overlap or occlusions between views. In this paper, we propose NVComposer, a novel approach that eliminates the need for explicit external alignment. NVComposer enables the generative model to implicitly infer spatial and geometric relationships between multiple conditional views by introducing two key components: 1) an image-pose dual-stream diffusion model that simultaneously generates target novel views and condition camera poses, and 2) a geometry-aware feature alignment module that distills geometric priors from dense stereo models during training. Extensive experiments demonstrate that NVComposer achieves state-of-the-art performance in generative multi-view NVS tasks, removing the reliance on external alignment and thus improving model accessibility. Our approach shows substantial improvements in synthesis quality as the number of unposed input views increases, highlighting its potential for more flexible and accessible generative NVS systems.
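
The abstract does not say how the geometry-aware feature alignment module measures agreement between the generative model's features and those of the dense stereo model. As a rough illustration only, the sketch below assumes a cosine-similarity distillation loss over paired feature vectors. The names `feature_alignment_loss`, `student_feats`, and `teacher_feats` are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def feature_alignment_loss(student_feats, teacher_feats):
    """Mean of (1 - cosine similarity) over paired feature vectors.

    Assumption: `student_feats` stands in for features from the generative
    model and `teacher_feats` for features from a frozen dense stereo model.
    The actual loss used by NVComposer may differ.
    """
    if len(student_feats) != len(teacher_feats):
        raise ValueError("student and teacher must have the same number of feature vectors")
    if not student_feats:
        raise ValueError("at least one feature vector is required")
    total = sum(
        1.0 - cosine_similarity(s, t)
        for s, t in zip(student_feats, teacher_feats)
    )
    return total / len(student_feats)
```

Under this assumption, the loss is 0 when every student vector points in the same direction as its teacher vector, and it grows toward 2 as they point in opposite directions. Because the abstract states that distillation happens during training, a term like this would be dropped at inference time.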
