NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

Abstract

We introduce NitroFusion, a fundamentally different approach to single-step diffusion that achieves high-quality generation through a dynamic adversarial framework. One-step methods offer dramatic speed advantages, but they typically suffer from quality degradation compared with their multi-step counterparts. Just as a panel of art critics provides comprehensive feedback by specializing in different aspects such as composition, color, and technique, our approach maintains a large pool of specialized discriminator heads that collectively guide the generation process. Each discriminator group develops expertise in specific quality aspects at different noise levels, providing diverse feedback that enables high-fidelity one-step generation. Our framework combines: (i) a dynamic discriminator pool with specialized discriminator groups to improve generation quality; (ii) strategic refresh mechanisms to prevent discriminator overfitting; and (iii) global-local discriminator heads for multi-scale quality assessment, together with unconditional/conditional training for balanced generation. Additionally, our framework uniquely supports flexible deployment through bottom-up refinement, allowing users to choose dynamically between one and four denoising steps with the same model for a direct quality-speed trade-off. Through comprehensive experiments, we demonstrate that NitroFusion significantly outperforms existing single-step methods across multiple evaluation metrics, particularly excelling in preserving fine details and global consistency.
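The pool-and-refresh idea in points (i) and (ii) can be illustrated with a small sketch. The abstract does not specify the head architecture, how heads are selected per step, or the refresh schedule, so everything below is an assumption made for illustration: heads are plain weight vectors over a fixed-size feature vector, a random subset is sampled each step, and a random fraction is re-initialized on refresh. The names `DiscriminatorPool`, `sample`, `score`, and `refresh` are hypothetical and not taken from the authors' code.

```python
import random


class DiscriminatorPool:
    """Toy pool of linear discriminator heads over fixed-size feature vectors.

    Illustrative only: the real heads, sampling rule, and refresh schedule
    are not described in the abstract.
    """

    def __init__(self, num_heads, feature_dim, seed=0):
        self.rng = random.Random(seed)
        self.feature_dim = feature_dim
        self.heads = [self._new_head() for _ in range(num_heads)]

    def _new_head(self):
        # A "head" here is just a randomly initialized weight vector.
        return [self.rng.gauss(0.0, 0.02) for _ in range(self.feature_dim)]

    def sample(self, k):
        """Pick k distinct head indices to supply feedback for one step."""
        return self.rng.sample(range(len(self.heads)), k)

    def score(self, head_idx, features):
        """Logit of one head: dot product of its weights with the features."""
        return sum(w * x for w, x in zip(self.heads[head_idx], features))

    def refresh(self, fraction):
        """Re-initialize a random fraction of heads (assumed refresh rule)."""
        count = max(1, int(len(self.heads) * fraction))
        chosen = self.rng.sample(range(len(self.heads)), count)
        for i in chosen:
            self.heads[i] = self._new_head()
        return chosen


pool = DiscriminatorPool(num_heads=16, feature_dim=8)
features = [0.5] * 8                    # stand-in for features of one image
active = pool.sample(k=4)               # heads used in this training step
logits = [pool.score(i, features) for i in active]
replaced = pool.refresh(fraction=0.25)  # discard and re-create some heads
print(len(active), len(logits), len(replaced))
```

In a real training loop the sampled heads would be trained alongside the generator and the refresh would run on some schedule; both details are left out here because the abstract does not state them.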
