Efficient Generative Model Training via Embedded Representation Warmup

April 14, 2025
Authors: Deyuan Liu, Peng Sun, Xufeng Li, Tao Lin
cs.AI

Abstract

Diffusion models excel at generating high-dimensional data but fall short of self-supervised methods in training efficiency and representation quality. We identify a key bottleneck: the underutilization of high-quality, semantically rich representations during training notably slows convergence. Our systematic analysis reveals a critical representation processing region, primarily in the early layers, where semantic and structural pattern learning takes place before generation can occur. To address this, we propose Embedded Representation Warmup (ERW), a plug-and-play framework in which, in the first stage, the ERW module serves as a warmup that initializes the early layers of the diffusion model with high-quality, pretrained representations. This warmup minimizes the burden of learning representations from scratch, thereby accelerating convergence and boosting performance. Our theoretical analysis demonstrates that ERW's efficacy depends on its precise integration into specific neural network layers, termed the representation processing region, where the model primarily processes and transforms feature representations for later generation. We further establish that ERW not only accelerates training convergence but also enhances representation quality: empirically, our method achieves a 40× acceleration in training speed compared to REPA, the current state-of-the-art method. Code is available at https://github.com/LINs-lab/ERW.
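
The warmup idea can be illustrated with a deliberately tiny toy. The sketch below is not the authors' implementation (see the repository linked above for that). It assumes one plausible reading of the first stage: an "early layer" is fitted so that its output matches the features of a frozen pretrained encoder before any generative training starts. Every name here (`pretrained_encoder`, `alignment_loss`, `warmup_early_layer`), the scalar model, and the hyperparameters are hypothetical and chosen only for illustration.

```python
# Toy sketch of a representation warmup stage (assumption: warmup means
# aligning an early layer with a frozen pretrained encoder's features).

def pretrained_encoder(x):
    """Hypothetical frozen encoder; here simply f(x) = 2x."""
    return 2.0 * x


def alignment_loss(w, samples):
    """Mean squared error between the early layer w * x and the encoder."""
    return sum((w * x - pretrained_encoder(x)) ** 2 for x in samples) / len(samples)


def warmup_early_layer(w, samples, lr=0.05, steps=200):
    """Run plain gradient descent on the alignment loss and return the new w."""
    for _ in range(steps):
        # Derivative of the mean squared error with respect to w.
        grad = sum(2.0 * (w * x - pretrained_encoder(x)) * x for x in samples) / len(samples)
        w -= lr * grad
    return w


samples = [-1.0, -0.5, 0.5, 1.0]
w_init = 0.0                                   # untrained early layer
w_warm = warmup_early_layer(w_init, samples)   # stage 1: warmup

print("alignment loss before warmup:", alignment_loss(w_init, samples))
print("alignment loss after warmup: ", alignment_loss(w_warm, samples))
```

In this toy, a second stage would start generative training from `w_warm` instead of `w_init`, so the early layer no longer has to learn its representation from scratch. How the actual method selects layers and schedules the two stages is not specified in the abstract.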

