MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with Mixture of Score Guidance

December 6, 2024
Authors: Hidir Yesiltepe, Tuna Han Salih Meral, Connor Dunlop, Pinar Yanardag
cs.AI

Abstract

In this work, we propose the first motion transfer approach for diffusion transformers through Mixture of Score Guidance (MSG), a theoretically grounded framework for motion transfer in diffusion models. Our key theoretical contribution lies in reformulating the conditional score to decompose it into a motion score and a content score. By formulating motion transfer as a mixture of potential energies, MSG naturally preserves scene composition and enables creative scene transformations while maintaining the integrity of the transferred motion patterns. This novel sampling procedure operates directly on pre-trained video diffusion models without additional training or fine-tuning. Through extensive experiments, MSG demonstrates successful handling of diverse scenarios, including single-object, multi-object, and cross-object motion transfer as well as complex camera motion transfer. Additionally, we introduce MotionBench, the first motion transfer dataset, consisting of 200 source videos and 1000 transferred motions, covering single- and multi-object transfers and complex camera motions.
