StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

December 11, 2024
Authors: Mingkun Lei, Xue Song, Beier Zhu, Hao Wang, Chi Zhang
cs.AI

Abstract

Text-driven style transfer aims to merge the style of a reference image with content described by a text prompt. Recent advances in text-to-image models have improved the nuance of style transformations, yet significant challenges remain, particularly overfitting to reference styles, limited stylistic control, and misalignment with textual content. In this paper, we propose three complementary strategies to address these issues. First, we introduce a cross-modal Adaptive Instance Normalization (AdaIN) mechanism that better integrates style and text features and so improves alignment. Second, we develop a Style-based Classifier-Free Guidance (SCFG) approach that enables selective control over stylistic elements, reducing irrelevant influences. Finally, we incorporate a teacher model during the early generation stages to stabilize spatial layouts and mitigate artifacts. Our extensive evaluations demonstrate significant improvements in style transfer quality and alignment with textual prompts. Furthermore, our approach can be integrated into existing style transfer frameworks without fine-tuning.
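
For readers unfamiliar with the two building blocks named above, the following is a minimal sketch of standard AdaIN and of the generic classifier-free-guidance combination. It is not the authors' implementation. The abstract does not say which features are normalized against which, nor how the SCFG terms are formed, so the roles assigned here (text features take on style-feature statistics; a "positive" and a "negative" noise prediction are blended) are assumptions, and the function names `adain` and `guided` are hypothetical.

```python
from statistics import fmean, pstdev


def adain(content, style, eps=1e-5):
    """Standard AdaIN on per-channel feature lists.

    content, style: list of channels, each a list of floats.
    Each content channel is shifted to zero mean and scaled to unit
    standard deviation, then rescaled to the mean and standard deviation
    of the matching style channel.
    Assumption: in the cross-modal case, `content` would hold text
    features and `style` would hold style-image features.
    """
    out = []
    for c_ch, s_ch in zip(content, style):
        c_mu, c_sigma = fmean(c_ch), pstdev(c_ch)
        s_mu, s_sigma = fmean(s_ch), pstdev(s_ch)
        out.append(
            [s_sigma * (v - c_mu) / (c_sigma + eps) + s_mu for v in c_ch]
        )
    return out


def guided(pred_pos, pred_neg, scale):
    """Generic classifier-free-guidance blend of two noise predictions.

    Moves the negative prediction toward (and past) the positive one.
    Assumption: a style-based variant would condition `pred_neg` on a
    style the user wants to suppress; the abstract does not specify this.
    """
    return [n + scale * (p - n) for p, n in zip(pred_pos, pred_neg)]


if __name__ == "__main__":
    text_feats = [[1.0, 2.0, 3.0]]
    style_feats = [[10.0, 20.0, 30.0]]
    print(adain(text_feats, style_feats))
    print(guided([1.0, 2.0], [0.0, 1.0], scale=2.0))
```

After `adain`, each output channel carries the style channel's mean and (up to the `eps` term) its standard deviation while preserving the relative ordering of the input values, which is the property that makes AdaIN a natural way to inject style statistics into another feature stream.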
