Motion Prompting: Controlling Video Generation with Motion Trajectories
December 3, 2024
Authors: Daniel Geng, Charles Herrmann, Junhwa Hur, Forrester Cole, Serena Zhang, Tobias Pfaff, Tatiana Lopez-Guevara, Carl Doersch, Yusuf Aytar, Michael Rubinstein, Chen Sun, Oliver Wang, Andrew Owens, Deqing Sun
cs.AI
Abstract
Motion control is crucial for generating expressive and compelling video
content; however, most existing video generation models rely mainly on text
prompts for control, which struggle to capture the nuances of dynamic actions
and temporal compositions. To this end, we train a video generation model
conditioned on spatio-temporally sparse or dense motion trajectories. In
contrast to prior motion conditioning work, this flexible representation can
encode any number of trajectories, object-specific or global scene motion, and
temporally sparse motion; due to its flexibility we refer to this conditioning
as motion prompts. While users may directly specify sparse trajectories, we
also show how to translate high-level user requests into detailed, semi-dense
motion prompts, a process we term motion prompt expansion. We demonstrate the
versatility of our approach through various applications, including camera and
object motion control, "interacting" with an image, motion transfer, and image
editing. Our results showcase emergent behaviors, such as realistic physics,
suggesting the potential of motion prompts for probing video models and
interacting with future generative world models. Finally, we evaluate
quantitatively, conduct a human study, and demonstrate strong performance.
Video results are available on our webpage: https://motion-prompting.github.io/
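To make the trajectory conditioning concrete, here is a minimal sketch of how spatio-temporally sparse point tracks could be rasterized into a per-frame conditioning signal. This is an illustrative assumption, not the paper's actual encoding: the track format `(frame, x, y, visible)`, the `encode_tracks` function, and the displacement-plus-mask channel layout are all hypothetical.

```python
import numpy as np

def encode_tracks(tracks, num_frames, height, width):
    """Rasterize point tracks into a (num_frames, height, width, 3) grid:
    per-pixel (dx, dy) displacement to the next frame plus a validity mask.
    Pixels with no track stay zero, so the conditioning can be arbitrarily
    sparse in both space and time. (Hypothetical format for illustration.)"""
    grid = np.zeros((num_frames, height, width, 3), dtype=np.float32)
    for track in tracks:
        # Pair each track point with its successor to get a displacement.
        for (t, x, y, visible), (_, x2, y2, _) in zip(track, track[1:]):
            if not visible:
                continue  # temporally sparse: skip occluded/unspecified frames
            xi, yi = int(round(x)), int(round(y))
            if 0 <= xi < width and 0 <= yi < height and t < num_frames:
                grid[t, yi, xi, 0] = x2 - x  # horizontal displacement
                grid[t, yi, xi, 1] = y2 - y  # vertical displacement
                grid[t, yi, xi, 2] = 1.0     # validity mask
    return grid

# A single track moving right by 4 pixels over one frame step.
track = [(0, 10.0, 20.0, True), (1, 14.0, 20.0, True)]
cond = encode_tracks([track], num_frames=8, height=64, width=64)
```

Any number of tracks, from a single user-drawn drag to a semi-dense field produced by "motion prompt expansion", can be fed through the same encoding, which matches the flexibility the abstract describes.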