
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

April 8, 2025
Authors: Jiazi Bu, Pengyang Ling, Yujie Zhou, Pan Zhang, Tong Wu, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Dahua Lin, Jiaqi Wang
cs.AI

Abstract

Text-to-image (T2I) diffusion/flow models have drawn considerable attention recently due to their remarkable ability to deliver flexible visual creations. Still, high-resolution image synthesis presents formidable challenges due to the scarcity and complexity of high-resolution content. To this end, we present HiFlow, a training-free and model-agnostic framework to unlock the resolution potential of pre-trained flow models. Specifically, HiFlow establishes a virtual reference flow within the high-resolution space that effectively captures the characteristics of low-resolution flow information, offering guidance for high-resolution generation through three key aspects: initialization alignment for low-frequency consistency, direction alignment for structure preservation, and acceleration alignment for detail fidelity. By leveraging this flow-aligned guidance, HiFlow substantially elevates the quality of high-resolution image synthesis of T2I models and demonstrates versatility across their personalized variants. Extensive experiments validate that HiFlow achieves higher high-resolution image quality than current state-of-the-art methods.
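
The abstract does not spell out how low-frequency consistency is enforced, so the following is only a generic illustration of the idea, not HiFlow's actual procedure. It is a minimal sketch assuming that the low-frequency band of a high-resolution array is swapped for that of an upsampled low-resolution reference in the Fourier domain. The function names, the `cutoff` value, and the nearest-neighbor upsampling are all hypothetical choices made for this example.

```python
import numpy as np

def low_pass_mask(h, w, cutoff):
    """Boolean mask of frequencies whose radius (cycles per pixel) is <= cutoff."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    return np.sqrt(fy**2 + fx**2) <= cutoff

def align_low_frequencies(high_res, reference, cutoff=0.1):
    """Replace the low-frequency band of `high_res` with that of `reference`.

    Hypothetical helper for illustration; not taken from the HiFlow paper.
    """
    assert high_res.shape == reference.shape
    mask = low_pass_mask(*high_res.shape, cutoff)
    spec_hr = np.fft.fft2(high_res)
    spec_ref = np.fft.fft2(reference)
    # Keep reference content inside the mask, high-res content outside it.
    fused = np.where(mask, spec_ref, spec_hr)
    # The mask is symmetric, so the fused spectrum is Hermitian and the
    # inverse transform is real up to floating-point error.
    return np.real(np.fft.ifft2(fused))

rng = np.random.default_rng(0)
low_res = rng.standard_normal((16, 16))          # stand-in for a low-res sample
reference = np.kron(low_res, np.ones((4, 4)))    # nearest-neighbor upsample to 64x64
high_res = rng.standard_normal((64, 64))         # stand-in for a high-res sample
aligned = align_low_frequencies(high_res, reference)
```

After the swap, `aligned` shares its coarse layout with `reference` while keeping the fine detail of `high_res`; the `cutoff` parameter controls where that split falls.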
