SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

December 10, 2024
Authors: Jianhong Bai, Menghan Xia, Xintao Wang, Ziyang Yuan, Xiao Fu, Zuozhu Liu, Haoji Hu, Pengfei Wan, Di Zhang
cs.AI

Abstract

Recent advances in video diffusion models have shown exceptional abilities in simulating real-world dynamics and maintaining 3D consistency. This progress inspires us to investigate whether these models can also ensure dynamic consistency across different viewpoints, a highly desirable feature for applications such as virtual filming. Unlike existing methods, which focus on multi-view generation of single objects for 4D reconstruction, our interest lies in generating open-world videos from arbitrary viewpoints, incorporating 6-DoF camera poses. To achieve this, we propose a plug-and-play module that enhances a pre-trained text-to-video model for multi-camera video generation, ensuring consistent content across viewpoints. Specifically, we introduce a multi-view synchronization module to maintain appearance and geometry consistency across these viewpoints. Given the scarcity of high-quality training data, we design a hybrid training scheme that leverages multi-camera images and monocular videos to supplement Unreal Engine-rendered multi-camera videos. Furthermore, our method enables intriguing extensions, such as re-rendering a video from novel viewpoints. We also release a multi-view synchronized video dataset, named SynCamVideo-Dataset. Project page: https://jianhongbai.github.io/SynCamMaster/.
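The abstract refers to 6-DoF camera poses without specifying a representation. As a minimal illustration only, not taken from the paper, a pose can be written as a 3x3 rotation (three degrees of freedom) plus a 3D translation (three more), and the relation between two viewpoints as the pose of one camera expressed in the other camera's frame. The function names, the ZYX Euler-angle convention, and the camera-to-world convention below are all assumptions made for this sketch.

```python
import math

def rotation_zyx(yaw, pitch, roll):
    """3x3 rotation R = Rz(yaw) * Ry(pitch) * Rx(roll); angles in radians.
    The ZYX convention is an assumption of this sketch."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def relative_pose(pose_a, pose_b):
    """Pose of camera b expressed in camera a's frame.
    Each pose is (R, t), assumed camera-to-world, so the result is
    (R_a^T R_b, R_a^T (t_b - t_a))."""
    (r_a, t_a), (r_b, t_b) = pose_a, pose_b
    r_a_inv = transpose(r_a)  # the inverse of a rotation is its transpose
    r_rel = matmul(r_a_inv, r_b)
    t_rel = matvec(r_a_inv, [tb - ta for ta, tb in zip(t_a, t_b)])
    return r_rel, t_rel

# Hypothetical example: two cameras with the same orientation,
# the second one shifted 2 units along the world y-axis.
cam_a = (rotation_zyx(math.pi / 2, 0.0, 0.0), [0.0, 0.0, 0.0])
cam_b = (rotation_zyx(math.pi / 2, 0.0, 0.0), [0.0, 2.0, 0.0])
r_rel, t_rel = relative_pose(cam_a, cam_b)
```

Here the relative rotation is the identity (up to floating-point error), since both cameras share an orientation, and the world-space offset along y appears along camera a's own x-axis.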
