NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

Abstract

We introduce NitroFusion, a fundamentally different approach to single-step diffusion that achieves high-quality generation through a dynamic adversarial framework. One-step methods offer dramatic speed advantages, but they typically suffer from quality degradation compared to their multi-step counterparts. Just as a panel of art critics provides comprehensive feedback by specializing in different aspects such as composition, color, and technique, our approach maintains a large pool of specialized discriminator heads that collectively guide the generation process. Each discriminator group develops expertise in specific quality aspects at different noise levels, providing diverse feedback that enables high-fidelity one-step generation. Our framework combines: (i) a dynamic discriminator pool with specialized discriminator groups to improve generation quality, (ii) strategic refresh mechanisms to prevent discriminator overfitting, and (iii) global-local discriminator heads for multi-scale quality assessment, together with unconditional/conditional training for balanced generation. Additionally, our framework supports flexible deployment through bottom-up refinement: the same model can be run with anywhere from 1 to 4 denoising steps, giving users a direct quality-speed trade-off. Through comprehensive experiments, we demonstrate that NitroFusion significantly outperforms existing single-step methods across multiple evaluation metrics, particularly excelling in preserving fine details and global consistency.
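To make the idea of a dynamic discriminator pool with a refresh mechanism more concrete, the following is a minimal toy sketch in plain Python. It is not the authors' implementation: the class name `DiscriminatorHeadPool`, the linear heads, the random-subset sampling, the refresh fraction, and the softplus-style generator loss are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import math
import random


class DiscriminatorHeadPool:
    """Toy pool of discriminator heads (hypothetical stand-in, not the paper's code).

    Each head is a plain weight vector that scores a feature vector linearly.
    """

    def __init__(self, num_heads, feature_dim, seed=0):
        self.rng = random.Random(seed)
        self.feature_dim = feature_dim
        self.heads = [self._new_head() for _ in range(num_heads)]

    def _new_head(self):
        # Fresh head with small random weights.
        return [self.rng.gauss(0.0, 0.02) for _ in range(self.feature_dim)]

    def sample(self, k):
        # Assumption: each training step draws k distinct heads from the pool.
        return self.rng.sample(range(len(self.heads)), k)

    def score(self, head_idx, features):
        # Linear logit produced by one head.
        return sum(w * x for w, x in zip(self.heads[head_idx], features))

    def refresh(self, fraction):
        # Assumption: refreshing means re-initializing a random fraction of
        # heads so that no single head can overfit over a long training run.
        n = max(1, int(len(self.heads) * fraction))
        chosen = self.rng.sample(range(len(self.heads)), n)
        for i in chosen:
            self.heads[i] = self._new_head()
        return chosen


def generator_loss(pool, features, k):
    # Average softplus(-logit) over the sampled heads: the loss shrinks as
    # the sampled heads rate the generated features as more "real".
    idxs = pool.sample(k)
    losses = [math.log1p(math.exp(-pool.score(i, features))) for i in idxs]
    return sum(losses) / len(losses)


pool = DiscriminatorHeadPool(num_heads=16, feature_dim=8)
fake_features = [0.5] * 8
loss = generator_loss(pool, fake_features, k=4)
refreshed = pool.refresh(fraction=0.25)
```

In this sketch the generator receives feedback from a different subset of heads at every step, and periodically calling `refresh` replaces part of the pool with newly initialized heads. How often to refresh and how large the pool should be are tuning choices that this toy example does not settle.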
