HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

April 8, 2025
Authors: Jiazi Bu, Pengyang Ling, Yujie Zhou, Pan Zhang, Tong Wu, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Dahua Lin, Jiaqi Wang
cs.AI

Abstract

Text-to-image (T2I) diffusion/flow models have drawn considerable attention recently due to their remarkable ability to deliver flexible visual creations. Still, high-resolution image synthesis presents formidable challenges due to the scarcity and complexity of high-resolution content. To this end, we present HiFlow, a training-free and model-agnostic framework to unlock the resolution potential of pre-trained flow models. Specifically, HiFlow establishes a virtual reference flow within the high-resolution space that effectively captures the characteristics of low-resolution flow information, offering guidance for high-resolution generation through three key aspects: initialization alignment for low-frequency consistency, direction alignment for structure preservation, and acceleration alignment for detail fidelity. By leveraging this flow-aligned guidance, HiFlow substantially elevates the quality of high-resolution image synthesis of T2I models and demonstrates versatility across their personalized variants. Extensive experiments validate HiFlow's superiority over current state-of-the-art methods in high-resolution image quality.
