CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
November 27, 2024
Authors: Rundi Wu, Ruiqi Gao, Ben Poole, Alex Trevithick, Changxi Zheng, Jonathan T. Barron, Aleksander Holynski
cs.AI
Abstract
We present CAT4D, a method for creating 4D (dynamic 3D) scenes from monocular
video. CAT4D leverages a multi-view video diffusion model trained on a diverse
combination of datasets to enable novel view synthesis at any specified camera
poses and timestamps. Combined with a novel sampling approach, this model can
transform a single monocular video into a multi-view video, enabling robust 4D
reconstruction via optimization of a deformable 3D Gaussian representation. We
demonstrate competitive performance on novel view synthesis and dynamic scene
reconstruction benchmarks, and highlight the creative capabilities for 4D scene
generation from real or generated videos. See our project page for results and
interactive demos: cat-4d.github.io.
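
The abstract describes a two-stage pipeline: a multi-view video diffusion model plus a sampling procedure first turn the monocular input into a multi-view video, and a deformable 3D Gaussian representation is then optimized against that output. The sketch below is only a conceptual illustration of that flow under stated assumptions; the class `Frame` and the functions `multi_view_video_diffusion` and `optimize_deformable_gaussians` are hypothetical placeholders, not the authors' actual API or implementation.

```python
# Conceptual sketch of the pipeline outlined in the abstract.
# All names below are hypothetical placeholders, not the paper's code.

from dataclasses import dataclass
from typing import Any, List


@dataclass
class Frame:
    """One frame: an image with its camera pose and timestamp."""
    image: Any          # e.g. an HxWx3 array
    camera_pose: Any    # e.g. a 4x4 camera-to-world matrix
    timestamp: float


def multi_view_video_diffusion(frames: List[Frame],
                               target_poses: List[Any],
                               target_times: List[float]) -> List[Frame]:
    """Placeholder for the multi-view video diffusion model: synthesize
    novel views at the requested camera poses and timestamps."""
    raise NotImplementedError("stand-in for the trained diffusion model")


def optimize_deformable_gaussians(frames: List[Frame]) -> Any:
    """Placeholder for fitting a deformable 3D Gaussian representation
    to the generated multi-view video."""
    raise NotImplementedError("stand-in for the 4D reconstruction step")


def create_4d_scene(monocular_video: List[Frame],
                    target_poses: List[Any],
                    target_times: List[float]) -> Any:
    # Stage 1: expand the single monocular video into a multi-view video
    # by sampling the diffusion model at chosen poses and timestamps.
    multi_view_video = multi_view_video_diffusion(
        monocular_video, target_poses, target_times)

    # Stage 2: reconstruct the dynamic (4D) scene by optimizing a
    # deformable 3D Gaussian representation against those views.
    return optimize_deformable_gaussians(multi_view_video)
```

This skeleton only mirrors the data flow stated in the abstract; the actual model architecture, sampling strategy, and optimization objective are detailed in the paper and on the project page.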