Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

November 28, 2024
Authors: Feng Liu, Shiwei Zhang, Xiaofeng Wang, Yujie Wei, Haonan Qiu, Yuzhong Zhao, Yingya Zhang, Qixiang Ye, Fang Wan
cs.AI

Abstract

As a fundamental backbone for video generation, diffusion models are challenged by low inference speed due to the sequential nature of denoising. Previous methods speed up the models by caching and reusing model outputs at uniformly selected timesteps. However, such a strategy neglects the fact that differences among model outputs are not uniform across timesteps, which hinders selecting the appropriate model outputs to cache, leading to a poor balance between inference efficiency and visual quality. In this study, we introduce Timestep Embedding Aware Cache (TeaCache), a training-free caching approach that estimates and leverages the fluctuating differences among model outputs across timesteps. Rather than directly using the time-consuming model outputs, TeaCache focuses on model inputs, which have a strong correlation with the model outputs while incurring negligible computational cost. TeaCache first modulates the noisy inputs using the timestep embeddings to ensure that their differences better approximate those of the model outputs. TeaCache then introduces a rescaling strategy to refine the estimated differences and utilizes them to indicate output caching. Experiments show that TeaCache achieves up to 4.41x acceleration over Open-Sora-Plan with negligible (-0.07% Vbench score) degradation of visual quality.
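
The caching rule described in the abstract can be made concrete with a short sketch. The following is a minimal, hypothetical Python illustration rather than the authors' implementation: names such as rel_l1, rescale, and TeaCacheSketch, and the default threshold, are assumptions. It accumulates a rescaled relative L1 difference between consecutive timestep-embedding-modulated inputs and reuses the cached model output until the accumulated estimate crosses a threshold, at which point it runs a fresh forward pass and resets the accumulator.

import numpy as np

def rel_l1(curr: np.ndarray, prev: np.ndarray) -> float:
    # Relative L1 distance between consecutive modulated inputs.
    return float(np.abs(curr - prev).mean() / (np.abs(prev).mean() + 1e-8))

def rescale(d: float, coeffs=(1.0, 0.0)) -> float:
    # Polynomial rescaling of the raw input difference; in the paper this
    # mapping is fitted offline per model so that input differences track
    # output differences. The identity coefficients here are placeholders.
    return float(np.polyval(coeffs, d))

class TeaCacheSketch:
    # Hypothetical caching controller implementing the threshold rule above.
    def __init__(self, threshold: float = 0.1):
        self.threshold = threshold
        self.acc = 0.0             # accumulated estimated output difference
        self.prev_input = None     # last timestep-modulated input seen
        self.cached_output = None  # output of the last fresh forward pass

    def step(self, modulated_input: np.ndarray, model_fn):
        if self.prev_input is not None:
            self.acc += rescale(rel_l1(modulated_input, self.prev_input))
        self.prev_input = modulated_input
        if self.cached_output is not None and self.acc < self.threshold:
            return self.cached_output   # reuse: skip the expensive model call
        self.acc = 0.0                  # reset after computing a fresh output
        self.cached_output = model_fn(modulated_input)
        return self.cached_output

Because the decision uses only the modulated inputs, estimating when to reuse an output costs a few array operations per timestep, which is what allows the method to remain training-free and nearly overhead-free.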
