NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training
December 2, 2024
Authors: Dar-Yen Chen, Hmrishav Bandyopadhyay, Kai Zou, Yi-Zhe Song
cs.AI
Abstract
We introduce NitroFusion, a fundamentally different approach to single-step
diffusion that achieves high-quality generation through a dynamic adversarial
framework. While one-step methods offer dramatic speed advantages, they
typically suffer from quality degradation compared to their multi-step
counterparts. Just as a panel of art critics provides comprehensive feedback by
specializing in different aspects like composition, color, and technique, our
approach maintains a large pool of specialized discriminator heads that
collectively guide the generation process. Each discriminator group develops
expertise in specific quality aspects at different noise levels, providing
diverse feedback that enables high-fidelity one-step generation. Our framework
combines: (i) a dynamic discriminator pool with specialized discriminator
groups to improve generation quality, (ii) strategic refresh mechanisms to
prevent discriminator overfitting, and (iii) global-local discriminator heads
for multi-scale quality assessment, together with unconditional/conditional
training for balanced generation. Additionally, our framework uniquely supports
flexible deployment through bottom-up refinement, allowing users to dynamically
choose from 1 to 4 denoising steps with the same model for direct quality-speed
trade-offs. Through comprehensive experiments, we demonstrate that NitroFusion
significantly outperforms existing single-step methods across multiple
evaluation metrics, particularly excelling in preserving fine details and
global consistency.
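The abstract describes a pool of discriminator heads that is sampled during training and periodically refreshed to prevent overfitting. The toy sketch below illustrates only that general idea and is not the authors' implementation: the class `HeadPool`, the linear scoring function, the pool size, the subset size, and the refresh fraction are all hypothetical choices made for illustration, and each "head" is reduced to a plain weight vector.

```python
import random


class HeadPool:
    """Toy pool of discriminator 'heads' (hypothetical; each head is a weight vector)."""

    def __init__(self, size, dim, refresh_fraction=0.1, seed=0):
        self.rng = random.Random(seed)
        self.dim = dim
        self.refresh_fraction = refresh_fraction  # assumed value, not from the paper
        self.heads = [self._new_head() for _ in range(size)]

    def _new_head(self):
        # A fresh random initialization stands in for a re-initialized head.
        return [self.rng.gauss(0.0, 0.02) for _ in range(self.dim)]

    def sample(self, k):
        # Pick k distinct head indices to score the current batch.
        return self.rng.sample(range(len(self.heads)), k)

    def refresh(self):
        # Re-initialize a random subset of heads and report which ones changed.
        n = max(1, int(len(self.heads) * self.refresh_fraction))
        chosen = self.rng.sample(range(len(self.heads)), n)
        for i in chosen:
            self.heads[i] = self._new_head()
        return chosen


def score(head, features):
    # Linear "realness" logit: dot product of head weights and input features.
    return sum(w * x for w, x in zip(head, features))


pool = HeadPool(size=20, dim=4, refresh_fraction=0.1)
active = pool.sample(k=5)                      # heads used for this step
features = [0.5, -1.0, 0.25, 2.0]              # stand-in for backbone features
logits = [score(pool.heads[i], features) for i in active]
refreshed = pool.refresh()                     # indices of re-initialized heads
```

In a real training loop the logits from the sampled heads would feed an adversarial loss, and the refresh would run on some schedule; both details are left out here because the abstract does not specify them.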