3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
December 10, 2024
Authors: Xiao Fu, Xian Liu, Xintao Wang, Sida Peng, Menghan Xia, Xiaoyu Shi, Ziyang Yuan, Pengfei Wan, Di Zhang, Dahua Lin
cs.AI
Abstract
This paper aims to manipulate multi-entity 3D motions in video generation.
Previous methods on controllable video generation primarily leverage 2D control
signals to manipulate object motions and have achieved remarkable synthesis
results. However, 2D control signals are inherently limited in expressing the
3D nature of object motions. To overcome this problem, we introduce
3DTrajMaster, a robust controller that regulates multi-entity dynamics in 3D
space, given user-desired 6DoF pose (location and rotation) sequences of
entities. At the core of our approach is a plug-and-play 3D-motion grounded
object injector that fuses multiple input entities with their respective 3D
trajectories through a gated self-attention mechanism. In addition, we exploit
an injector architecture to preserve the video diffusion prior, which is
crucial for generalization ability. To mitigate video quality degradation, we
introduce a domain adaptor during training and employ an annealed sampling
strategy during inference. To address the lack of suitable training data, we
construct a 360-Motion Dataset, which first pairs collected 3D human and
animal assets with GPT-generated trajectories and then captures their motion with
12 evenly spaced surrounding cameras on diverse 3D UE platforms. Extensive experiments
show that 3DTrajMaster sets a new state-of-the-art in both accuracy and
generalization for controlling multi-entity 3D motions. Project page:
http://fuxiao0719.github.io/projects/3dtrajmaster
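
The abstract names a gated self-attention mechanism as the way entity and trajectory information is injected while the video diffusion prior is preserved. The sketch below illustrates one common form of that idea (a tanh-gated residual over joint self-attention, as used in GLIGEN-style injectors); the abstract does not confirm that 3DTrajMaster uses exactly this form. The function names, the identity projections, and the assumption that each entity token already combines an entity embedding with its 6DoF pose embedding are all hypothetical simplifications.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity projections.

    A real layer would apply learned query/key/value matrices; they are
    omitted here (an assumption) to keep the sketch short.
    """
    dim = len(tokens[0])
    attended = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[j] for w, value in zip(weights, tokens)) for j in range(dim)]
        )
    return attended


def gated_self_attention(video_tokens, entity_tokens, gate):
    """Hypothetical injector step: attend over video and entity tokens jointly,
    keep only the video positions, and add them back through a tanh gate."""
    joint = video_tokens + entity_tokens
    attended = self_attention(joint)[: len(video_tokens)]
    g = math.tanh(gate)
    return [
        [x + g * a for x, a in zip(token, update)]
        for token, update in zip(video_tokens, attended)
    ]


# Toy inputs: two video tokens and one entity token (assumed to already
# encode both the entity description and its pose).
video = [[1.0, 0.0], [0.0, 1.0]]
entities = [[0.5, 0.5]]

unchanged = gated_self_attention(video, entities, gate=0.0)
injected = gated_self_attention(video, entities, gate=1.0)
```

With the gate at zero the video tokens pass through untouched, so a layer initialized this way leaves the base model's behavior intact at the start of training; as the gate grows, the entity tokens influence the video tokens more strongly.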