NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training
December 2, 2024
Authors: Dar-Yen Chen, Hmrishav Bandyopadhyay, Kai Zou, Yi-Zhe Song
cs.AI
Abstract
We introduce NitroFusion, a fundamentally different approach to single-step
diffusion that achieves high-quality generation through a dynamic adversarial
framework. While one-step methods offer dramatic speed advantages, they
typically suffer from quality degradation compared to their multi-step
counterparts. Just as a panel of art critics provides comprehensive feedback by
specializing in different aspects like composition, color, and technique, our
approach maintains a large pool of specialized discriminator heads that
collectively guide the generation process. Each discriminator group develops
expertise in specific quality aspects at different noise levels, providing
diverse feedback that enables high-fidelity one-step generation. Our framework
combines: (i) a dynamic discriminator pool with specialized discriminator
groups to improve generation quality, (ii) strategic refresh mechanisms to
prevent discriminator overfitting, and (iii) global-local discriminator heads
for multi-scale quality assessment, together with unconditional/conditional
training for balanced generation. Additionally, our framework uniquely supports
flexible deployment through bottom-up refinement, allowing users to choose
anywhere from 1 to 4 denoising steps with the same model for direct
quality-speed trade-offs. Through comprehensive experiments, we demonstrate that NitroFusion
significantly outperforms existing single-step methods across multiple
evaluation metrics, particularly excelling in preserving fine details and
global consistency.
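
The dynamic discriminator pool and refresh mechanism described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract gives no pool size, sampling rule, or refresh schedule, so the class name `HeadPool`, the uniform random sampling, and every number below (16 heads, 4 active heads per step, a 25% refresh every 25 steps) are hypothetical choices made only for illustration. Each "head" is a plain list of weights rather than a real network.

```python
import random


class HeadPool:
    """Toy pool of discriminator heads (hypothetical; each head is a weight list)."""

    def __init__(self, size, dim, seed=0):
        self.rng = random.Random(seed)
        self.dim = dim
        self.heads = [self._new_head() for _ in range(size)]
        self.age = [0] * size  # number of updates each head has received

    def _new_head(self):
        # Small random initialization, standing in for a freshly initialized head.
        return [self.rng.gauss(0.0, 0.02) for _ in range(self.dim)]

    def sample(self, k):
        """Pick k distinct head indices to provide feedback this iteration."""
        return self.rng.sample(range(len(self.heads)), k)

    def mark_trained(self, indices):
        """Record that the given heads received one more update."""
        for i in indices:
            self.age[i] += 1

    def refresh(self, fraction):
        """Re-initialize a random fraction of heads, discarding what they learned."""
        n = int(len(self.heads) * fraction)
        chosen = self.rng.sample(range(len(self.heads)), n)
        for i in chosen:
            self.heads[i] = self._new_head()
            self.age[i] = 0
        return chosen


pool = HeadPool(size=16, dim=8)
for step in range(100):
    active = pool.sample(k=4)
    # A real trainer would compute adversarial losses with the active heads here.
    pool.mark_trained(active)
    if (step + 1) % 25 == 0:
        pool.refresh(fraction=0.25)  # assumed schedule, not from the paper
```

The idea the sketch tries to capture is that no single head sees every iteration and some heads are periodically reset, so the generator keeps receiving feedback from discriminators that have not yet overfit.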