MoViE: Mobile Diffusion for Video Editing
December 9, 2024
Authors: Adil Karjauv, Noor Fathima, Ioannis Lelekas, Fatih Porikli, Amir Ghodrati, Amirhossein Habibian
cs.AI
Abstract
Recent progress in diffusion-based video editing has shown remarkable
potential for practical applications. However, these methods remain
prohibitively expensive and challenging to deploy on mobile devices. In this
study, we introduce a series of optimizations that render mobile video editing
feasible. Building upon an existing image editing model, we first optimize its
architecture and incorporate a lightweight autoencoder. Subsequently, we extend
classifier-free guidance distillation to multiple modalities, resulting in a
threefold on-device speedup. Finally, we reduce the number of sampling steps to
one by introducing a novel adversarial distillation scheme which preserves the
controllability of the editing process. Collectively, these optimizations
enable video editing at 12 frames per second on mobile devices, while
maintaining high quality. Our results are available at
https://qualcomm-ai-research.github.io/mobile-video-editing/
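
To illustrate why distilling classifier-free guidance over multiple modalities can yield roughly a threefold speedup, the sketch below compares the number of forward passes per sampling step. It assumes an InstructPix2Pix-style guidance rule with separate image and text scales; the abstract does not state the exact formulation, so treat this as an assumption. `ToyDenoiser` and `ToyStudent` are hypothetical linear stand-ins, not the authors' networks. In a real system the student would be a trained network that takes the guidance scales as extra inputs.

```python
import numpy as np


class ToyDenoiser:
    """Hypothetical stand-in for a diffusion denoiser; counts forward passes."""

    def __init__(self):
        self.calls = 0

    def __call__(self, z, image_cond=None, text_cond=None):
        self.calls += 1
        out = 0.5 * z
        if image_cond is not None:
            out = out + 0.3 * image_cond
        if text_cond is not None:
            out = out + 0.2 * text_cond
        return out


def guided_eps(model, z, image_cond, text_cond, s_img, s_txt):
    """Two-scale classifier-free guidance (assumed form): three passes per step."""
    e_uncond = model(z)                       # no conditioning
    e_img = model(z, image_cond)              # image conditioning only
    e_full = model(z, image_cond, text_cond)  # image and text conditioning
    return e_uncond + s_img * (e_img - e_uncond) + s_txt * (e_full - e_img)


class ToyStudent:
    """Hypothetical distilled model: guidance scales are inputs, one pass per step.

    For the linear toy teacher above, the guided output has a closed form,
    so this student reproduces it exactly without any training.
    """

    def __init__(self):
        self.calls = 0

    def __call__(self, z, image_cond, text_cond, s_img, s_txt):
        self.calls += 1
        return 0.5 * z + s_img * 0.3 * image_cond + s_txt * 0.2 * text_cond


rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8))           # noisy latent
image_cond = rng.normal(size=(4, 8))  # source-frame conditioning
text_cond = rng.normal(size=(4, 8))   # instruction conditioning
s_img, s_txt = 1.5, 7.5               # illustrative guidance scales

teacher = ToyDenoiser()
student = ToyStudent()

teacher_out = guided_eps(teacher, z, image_cond, text_cond, s_img, s_txt)
student_out = student(z, image_cond, text_cond, s_img, s_txt)

print("teacher passes per step:", teacher.calls)
print("student passes per step:", student.calls)
print("outputs match:", np.allclose(teacher_out, student_out))
```

Because the guidance scales remain inputs to the student in this sketch, the user can still adjust how strongly the edit follows the source frame and the instruction.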