Efficient Generative Model Training via Embedded Representation Warmup
April 14, 2025
Authors: Deyuan Liu, Peng Sun, Xufeng Li, Tao Lin
cs.AI
Abstract
Diffusion models excel at generating high-dimensional data but fall short in
training efficiency and representation quality compared to self-supervised
methods. We identify a key bottleneck: the underutilization of high-quality,
semantically rich representations during training notably slows down
convergence. Our systematic analysis reveals a critical representation
processing region -- primarily in the early layers -- where semantic and
structural pattern learning takes place before generation can occur. To address
this, we propose Embedded Representation Warmup (ERW), a plug-and-play
framework in which, in the first stage, the ERW module serves as a warmup
that initializes the early layers of the diffusion model with high-quality,
pretrained representations. This warmup minimizes the burden of learning
representations from scratch, thereby accelerating convergence and boosting
performance. Our theoretical analysis demonstrates that ERW's efficacy depends
on its precise integration into specific neural network layers -- termed the
representation processing region -- where the model primarily processes and
transforms feature representations for later generation. We further establish
that ERW not only accelerates training convergence but also enhances
representation quality: empirically, our method achieves a 40×
acceleration in training speed compared to REPA, the current state-of-the-art
method. Code is available at https://github.com/LINs-lab/ERW.
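
The warmup idea described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper or its repository: the "early layers" are reduced to a single linear map, the "pretrained representations" come from a hypothetical frozen random encoder, and the alignment objective is a plain mean squared error. It shows only the first stage, in which the early layer is fitted to the frozen features before any generative training would begin.

```python
import numpy as np


def warmup_early_layers(x, target_feats, steps=200, lr=0.5, seed=0):
    """Fit a toy linear 'early layer' W so that x @ W approximates target_feats.

    Hypothetical stand-in for a warmup stage. The linear layer and the MSE
    alignment loss are assumptions for illustration, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    # Small random initialization, i.e. the "from scratch" starting point.
    W = rng.normal(scale=0.01, size=(x.shape[1], target_feats.shape[1]))
    losses = []
    for _ in range(steps):
        residual = x @ W - target_feats               # shape (n, d_out)
        losses.append(float(np.mean(residual ** 2)))  # MSE alignment loss
        grad = 2.0 * x.T @ residual / residual.size   # gradient of the MSE w.r.t. W
        W -= lr * grad                                # plain gradient descent step
    return W, losses


rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8))               # toy inputs
frozen_encoder = rng.normal(size=(8, 4))    # hypothetical pretrained encoder weights
target_feats = x @ frozen_encoder           # features the early layer should reproduce

W, losses = warmup_early_layers(x, target_feats)
print(f"alignment loss: {losses[0]:.4f} -> {losses[-1]:.6f}")
```

In a real pipeline the warmed-up weights would then serve as the starting point for the early layers during the second, generative training stage; that stage is omitted here.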