FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

October 25, 2024
Authors: Zhengyao Lv, Chenyang Si, Junhao Song, Zhenyu Yang, Yu Qiao, Ziwei Liu, Kwan-Yee K. Wong
cs.AI

Abstract

In this paper, we present FasterCache, a novel training-free strategy designed to accelerate the inference of video diffusion models with high-quality generation. By analyzing existing cache-based methods, we observe that directly reusing adjacent-step features degrades video quality due to the loss of subtle variations. We further perform a pioneering investigation of the acceleration potential of classifier-free guidance (CFG) and reveal significant redundancy between conditional and unconditional features within the same timestep. Capitalizing on these observations, we introduce FasterCache to substantially accelerate diffusion-based video generation. Our key contributions include a dynamic feature reuse strategy that preserves both feature distinction and temporal continuity, and CFG-Cache, which optimizes the reuse of conditional and unconditional outputs to further enhance inference speed without compromising video quality. We empirically evaluate FasterCache on recent video diffusion models. Experimental results show that FasterCache significantly accelerates video generation (e.g., a 1.67× speedup on Vchitect-2.0) while keeping video quality comparable to the baseline, and consistently outperforms existing methods in both inference speed and video quality.
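
To make the cache-based setup the abstract refers to concrete, below is a minimal sketch of naive adjacent-step feature reuse inside a classifier-free-guidance sampling loop. All names here (model, scheduler, reuse_interval, scheduler.step) are illustrative assumptions rather than the paper's released code; FasterCache's contribution is to replace the plain reuse shown here with a dynamic feature reuse strategy and a CFG-Cache over the conditional/unconditional branches.

```python
# Hypothetical sketch of naive cache-based reuse in a CFG sampling loop.
# `model`, `scheduler`, and `reuse_interval` are assumptions for
# illustration, not the authors' FasterCache implementation.

def sample_with_naive_cache(model, scheduler, latents, text_emb, null_emb,
                            guidance_scale=7.5, reuse_interval=2):
    cond_cache, uncond_cache = None, None
    for i, t in enumerate(scheduler.timesteps):
        if i % reuse_interval == 0 or cond_cache is None:
            # "Fresh" step: run the network for both CFG branches and cache them.
            cond_cache = model(latents, t, text_emb)
            uncond_cache = model(latents, t, null_emb)
        # Cached steps simply reuse the previous outputs, which is exactly
        # where the subtle step-to-step variation the paper points out is lost.
        noise_pred = uncond_cache + guidance_scale * (cond_cache - uncond_cache)
        latents = scheduler.step(noise_pred, t, latents)
    return latents
```

In this naive form, every cached step also skips the unconditional branch only because the whole prediction is skipped; the paper's CFG-Cache instead exploits the observed redundancy between the conditional and unconditional features within the same timestep to reuse one branch while keeping the other fresh.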
