Motion Prompting: Controlling Video Generation with Motion Trajectories
December 3, 2024
Authors: Daniel Geng, Charles Herrmann, Junhwa Hur, Forrester Cole, Serena Zhang, Tobias Pfaff, Tatiana Lopez-Guevara, Carl Doersch, Yusuf Aytar, Michael Rubinstein, Chen Sun, Oliver Wang, Andrew Owens, Deqing Sun
cs.AI
Abstract
Motion control is crucial for generating expressive and compelling video
content; however, most existing video generation models rely mainly on text
prompts for control, which struggle to capture the nuances of dynamic actions
and temporal compositions. To this end, we train a video generation model
conditioned on spatio-temporally sparse or dense motion trajectories. In
contrast to prior motion conditioning work, this flexible representation can
encode any number of trajectories, object-specific or global scene motion, and
temporally sparse motion; due to its flexibility we refer to this conditioning
as motion prompts. While users may directly specify sparse trajectories, we
also show how to translate high-level user requests into detailed, semi-dense
motion prompts, a process we term motion prompt expansion. We demonstrate the
versatility of our approach through various applications, including camera and
object motion control, "interacting" with an image, motion transfer, and image
editing. Our results showcase emergent behaviors, such as realistic physics,
suggesting the potential of motion prompts for probing video models and
interacting with future generative world models. Finally, we evaluate
quantitatively, conduct a human study, and demonstrate strong performance.
Video results are available on our webpage: https://motion-prompting.github.io/
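To make the trajectory conditioning concrete, here is a minimal, hypothetical sketch of how a sparse motion prompt might be represented: each trajectory is a sequence of (x, y, visible) values per frame, rasterized into a per-frame occupancy map. This is an illustrative toy stand-in, not the paper's actual encoding, which would typically use a learned embedding of the tracks; the function name and tensor layout are assumptions.

```python
import numpy as np

def rasterize_tracks(tracks, num_frames, height, width):
    """Rasterize sparse point tracks into a per-frame conditioning map.

    tracks: list of arrays of shape (num_frames, 3), each row holding
            (x, y, visible) for one frame; NaN x/y marks untracked frames.
    Returns a (num_frames, height, width) binary occupancy map, a toy
    stand-in for the model's learned trajectory conditioning.
    """
    cond = np.zeros((num_frames, height, width), dtype=np.float32)
    for track in tracks:
        for t in range(num_frames):
            x, y, visible = track[t]
            if visible and not (np.isnan(x) or np.isnan(y)):
                xi, yi = int(round(x)), int(round(y))
                if 0 <= xi < width and 0 <= yi < height:
                    cond[t, yi, xi] = 1.0  # mark the tracked point
    return cond

# One track moving diagonally across a 4-frame, 8x8 clip.
track = np.array([[1.0, 1.0, 1], [3.0, 3.0, 1], [5.0, 5.0, 1], [7.0, 7.0, 1]])
cond = rasterize_tracks([track], num_frames=4, height=8, width=8)
```

Because the representation is just a set of tracks, the same interface covers the cases the abstract lists: a single object track, a semi-dense grid of tracks for global camera motion, or tracks defined only on some frames (temporally sparse motion).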