3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
December 10, 2024
Authors: Xiao Fu, Xian Liu, Xintao Wang, Sida Peng, Menghan Xia, Xiaoyu Shi, Ziyang Yuan, Pengfei Wan, Di Zhang, Dahua Lin
cs.AI
Abstract
This paper aims to manipulate multi-entity 3D motions in video generation.
Previous methods on controllable video generation primarily leverage 2D control
signals to manipulate object motions and have achieved remarkable synthesis
results. However, 2D control signals are inherently limited in expressing the
3D nature of object motions. To overcome this problem, we introduce
3DTrajMaster, a robust controller that regulates multi-entity dynamics in 3D
space, given user-desired 6DoF pose (location and rotation) sequences of
entities. At the core of our approach is a plug-and-play 3D-motion grounded
object injector that fuses multiple input entities with their respective 3D
trajectories through a gated self-attention mechanism. In addition, we exploit
an injector architecture to preserve the video diffusion prior, which is
crucial for generalization ability. To mitigate video quality degradation, we
introduce a domain adaptor during training and employ an annealed sampling
strategy during inference. To address the lack of suitable training data, we
construct a 360-Motion Dataset, which first pairs collected 3D human and
animal assets with GPT-generated trajectories and then captures their motion with
12 evenly spaced surrounding cameras on diverse 3D UE platforms. Extensive experiments
show that 3DTrajMaster sets a new state-of-the-art in both accuracy and
generalization for controlling multi-entity 3D motions. Project page:
http://fuxiao0719.github.io/projects/3dtrajmaster
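
The gated self-attention injection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the gate's form, so the tanh-gated residual with a zero-initialized gate (one common formulation), the identity Q/K/V projections, and the element-wise addition used to pair an entity embedding with its pose embedding are all assumptions. Every function and variable name below is hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (assumption)."""
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens))
                    for i in range(dim)])
    return out


def entity_token(entity_embedding, pose_embedding):
    """Pair an entity with its trajectory by element-wise addition (assumption)."""
    return [e + p for e, p in zip(entity_embedding, pose_embedding)]


def gated_injection(video_tokens, entity_tokens, gate):
    """Attend over [video; entity] tokens, keep the video slice, and add it
    back to the input scaled by tanh(gate). With gate = 0 the layer is an
    identity map, so the pretrained video prior is left untouched."""
    attended = self_attention(video_tokens + entity_tokens)[:len(video_tokens)]
    g = math.tanh(gate)
    return [[x + g * a for x, a in zip(row, att)]
            for row, att in zip(video_tokens, attended)]


# Toy usage: two video tokens and one entity carrying a pose embedding.
video = [[1.0, 0.0], [0.0, 1.0]]
entity = entity_token([0.5, 0.5], [0.1, -0.1])
closed_gate = gated_injection(video, [entity], gate=0.0)
open_gate = gated_injection(video, [entity], gate=1.0)
```

With the gate at zero the output equals the input video tokens, which is one way an injected module can start training without disturbing the diffusion prior; as the gate opens, entity and trajectory information is mixed into the video tokens.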