MoViE: Mobile Diffusion for Video Editing

December 9, 2024
Authors: Adil Karjauv, Noor Fathima, Ioannis Lelekas, Fatih Porikli, Amir Ghodrati, Amirhossein Habibian
cs.AI

Abstract

Recent progress in diffusion-based video editing has shown remarkable potential for practical applications. However, these methods remain prohibitively expensive and challenging to deploy on mobile devices. In this study, we introduce a series of optimizations that make mobile video editing feasible. Building upon an existing image editing model, we first optimize its architecture and incorporate a lightweight autoencoder. Subsequently, we extend classifier-free guidance distillation to multiple modalities, resulting in a threefold on-device speedup. Finally, we reduce the number of sampling steps to one by introducing a novel adversarial distillation scheme that preserves the controllability of the editing process. Collectively, these optimizations enable video editing at 12 frames per second on mobile devices while maintaining high quality. Our results are available at https://qualcomm-ai-research.github.io/mobile-video-editing/
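
To illustrate why distilling multi-modal classifier-free guidance can plausibly yield about a threefold speedup, the sketch below counts denoiser evaluations per sampling step. It assumes the two-condition guidance rule popularized by InstructPix2Pix (an image condition and a text condition, each with its own scale); the abstract does not name the base model, so this is an assumption, not the authors' implementation. The names `guided_noise`, `CountingDenoiser`, `s_img`, and `s_txt` are hypothetical, and the toy denoiser is a stand-in for a real network.

```python
from typing import List, Optional

Vector = List[float]


class CountingDenoiser:
    """Toy stand-in for a diffusion denoiser that counts forward passes.

    Hypothetical behaviour: the output is the latent plus whichever
    conditions are supplied. A real model would be a neural network.
    """

    def __init__(self) -> None:
        self.calls = 0

    def __call__(
        self,
        latent: Vector,
        image_cond: Optional[Vector],
        text_cond: Optional[Vector],
    ) -> Vector:
        self.calls += 1
        out = list(latent)
        if image_cond is not None:
            out = [o + c for o, c in zip(out, image_cond)]
        if text_cond is not None:
            out = [o + c for o, c in zip(out, text_cond)]
        return out


def guided_noise(
    denoiser: CountingDenoiser,
    latent: Vector,
    image_cond: Vector,
    text_cond: Vector,
    s_img: float,
    s_txt: float,
) -> Vector:
    """Two-condition classifier-free guidance (assumed InstructPix2Pix form).

    e = e(0, 0) + s_img * (e(c_I, 0) - e(0, 0)) + s_txt * (e(c_I, c_T) - e(c_I, 0))
    This costs three denoiser passes per sampling step.
    """
    e_uncond = denoiser(latent, None, None)
    e_img = denoiser(latent, image_cond, None)
    e_full = denoiser(latent, image_cond, text_cond)
    return [
        u + s_img * (i - u) + s_txt * (f - i)
        for u, i, f in zip(e_uncond, e_img, e_full)
    ]


denoiser = CountingDenoiser()
latent = [1.0, 2.0]
image_cond = [0.5, 0.5]
text_cond = [0.25, -0.25]

noise = guided_noise(denoiser, latent, image_cond, text_cond, s_img=1.5, s_txt=7.5)
passes_per_step = denoiser.calls  # three passes for one guided estimate
```

Under this assumption, a guidance-distilled student that takes the two scales as extra inputs and predicts the combined estimate directly needs one pass per step instead of three, which is consistent with the threefold speedup the abstract reports.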
