Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

November 28, 2024
Authors: Feng Liu, Shiwei Zhang, Xiaofeng Wang, Yujie Wei, Haonan Qiu, Yuzhong Zhao, Yingya Zhang, Qixiang Ye, Fang Wan
cs.AI

Abstract

As a fundamental backbone for video generation, diffusion models are challenged by low inference speed due to the sequential nature of denoising. Previous methods speed up the models by caching and reusing model outputs at uniformly selected timesteps. However, such a strategy neglects the fact that differences among model outputs are not uniform across timesteps, which hinders selecting the appropriate model outputs to cache, leading to a poor balance between inference efficiency and visual quality. In this study, we introduce Timestep Embedding Aware Cache (TeaCache), a training-free caching approach that estimates and leverages the fluctuating differences among model outputs across timesteps. Rather than directly using the time-consuming model outputs, TeaCache focuses on model inputs, which have a strong correlation with the model outputs while incurring negligible computational cost. TeaCache first modulates the noisy inputs using the timestep embeddings to ensure their differences better approximate those of the model outputs. TeaCache then introduces a rescaling strategy to refine the estimated differences and utilizes them to indicate output caching. Experiments show that TeaCache achieves up to 4.41x acceleration over Open-Sora-Plan with negligible (-0.07% Vbench score) degradation of visual quality.
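
To make the caching decision the abstract describes more concrete, here is a minimal sketch of how a timestep-embedding-aware cache could work: measure how much the embedding-modulated input changes between timesteps, refine that estimate with a rescaling function, and reuse the cached model output while the accumulated estimated change stays below a budget. This is an illustrative assumption, not the authors' released implementation; the class and parameter names (`TeaCacheSketch`, `should_reuse`, `threshold`, `rescale_coeffs`) and the polynomial form of the rescaling are hypothetical, as are the `modulate`, `timestep_embedding`, and `model` helpers in the usage comments.

```python
import numpy as np


def rel_l1_distance(x, y):
    """Relative L1 distance between two modulated-input tensors (cheap proxy
    for the difference between consecutive model outputs)."""
    return np.abs(x - y).mean() / (np.abs(y).mean() + 1e-8)


class TeaCacheSketch:
    """Sketch of a training-free output cache driven by input differences."""

    def __init__(self, threshold=0.1, rescale_coeffs=(1.0, 0.0)):
        self.threshold = threshold            # accumulated-difference budget before recomputing
        self.rescale_coeffs = rescale_coeffs  # assumed polynomial used to refine the raw estimate
        self.accum = 0.0                      # accumulated estimated difference since last compute
        self.prev_modulated = None            # modulated input from the previous timestep
        self.cached_output = None             # model output reused when the cache is hit

    def should_reuse(self, modulated_input):
        """Return True if the cached output can be reused at this timestep.

        `modulated_input` is the noisy input after modulation by the timestep
        embedding, used here as a proxy for the (not yet computed) model output.
        """
        if self.prev_modulated is None:
            reuse = False  # always compute the first step
        else:
            raw = rel_l1_distance(modulated_input, self.prev_modulated)
            # Rescaling step: map the raw input difference to an estimate of the
            # output difference (polynomial coefficients are an assumption).
            est = float(np.polyval(self.rescale_coeffs, raw))
            self.accum += est
            reuse = self.accum < self.threshold
            if not reuse:
                self.accum = 0.0  # budget resets whenever the model is recomputed
        self.prev_modulated = modulated_input
        return reuse


# Assumed usage inside a denoising loop (modulate, timestep_embedding, and
# model are hypothetical stand-ins for the diffusion backbone's components):
#
# cache = TeaCacheSketch(threshold=0.1)
# for t in timesteps:
#     x_mod = modulate(x_t, timestep_embedding(t))
#     if cache.should_reuse(x_mod):
#         out = cache.cached_output          # skip the expensive forward pass
#     else:
#         out = model(x_t, t)                # recompute and refresh the cache
#         cache.cached_output = out
```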
