StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements
December 11, 2024
Authors: Mingkun Lei, Xue Song, Beier Zhu, Hao Wang, Chi Zhang
cs.AI
Abstract
Text-driven style transfer aims to merge the style of a reference image with
content described by a text prompt. Recent advancements in text-to-image models
have improved the nuance of style transformations, yet significant challenges
remain, particularly with overfitting to reference styles, limiting stylistic
control, and misaligning with textual content. In this paper, we propose three
complementary strategies to address these issues. First, we introduce a
cross-modal Adaptive Instance Normalization (AdaIN) mechanism for better
integration of style and text features, enhancing alignment. Second, we develop
a Style-based Classifier-Free Guidance (SCFG) approach that enables selective
control over stylistic elements, reducing irrelevant influences. Finally, we
incorporate a teacher model during early generation stages to stabilize spatial
layouts and mitigate artifacts. Our extensive evaluations demonstrate
significant improvements in style transfer quality and alignment with textual
prompts. Furthermore, our approach can be integrated into existing style
transfer frameworks without fine-tuning.
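The first strategy builds on Adaptive Instance Normalization (AdaIN), which transfers style by matching the per-channel mean and standard deviation of content features to those of style features. The sketch below shows the standard AdaIN operation only, not the paper's cross-modal variant (which additionally integrates text features); the function name and tensor layout (batch, channels, height, width) are illustrative assumptions.

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """Standard AdaIN: re-normalize content features so that each channel's
    spatial mean/std matches the corresponding style-feature statistics.
    Inputs are (batch, channels, height, width) arrays."""
    c_mean = content_feat.mean(axis=(-2, -1), keepdims=True)
    c_std = content_feat.std(axis=(-2, -1), keepdims=True) + eps  # eps avoids divide-by-zero
    s_mean = style_feat.mean(axis=(-2, -1), keepdims=True)
    s_std = style_feat.std(axis=(-2, -1), keepdims=True) + eps
    # Whiten the content statistics, then re-color with the style statistics.
    return s_std * (content_feat - c_mean) / c_std + s_mean
```

In the paper's cross-modal setting, the "style" statistics would come from the reference-image features while the normalized features carry the text-conditioned content, so the text prompt keeps control of semantics while the reference image sets the feature statistics.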