Style-Friendly SNR Sampler for Style-Driven Generation

November 22, 2024
Authors: Jooyoung Choi, Chaehun Shin, Yeongtak Oh, Heeseung Kim, Sungroh Yoon
cs.AI

Abstract

Recent large-scale diffusion models generate high-quality images but struggle to learn new, personalized artistic styles, which limits the creation of unique style templates. Fine-tuning with reference images is the most promising approach, but it often blindly utilizes objectives and noise level distributions used for pre-training, leading to suboptimal style alignment. We propose the Style-friendly SNR sampler, which aggressively shifts the signal-to-noise ratio (SNR) distribution toward higher noise levels during fine-tuning to focus on noise levels where stylistic features emerge. This enables models to better capture unique styles and generate images with higher style alignment. Our method allows diffusion models to learn and share new "style templates", enhancing personalized content creation. We demonstrate the ability to generate styles such as personal watercolor paintings, minimal flat cartoons, 3D renderings, multi-panel images, and memes with text, thereby broadening the scope of style-driven generation.
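
The core idea, drawing fine-tuning noise levels from an SNR distribution shifted toward high noise, can be illustrated with a minimal sketch. The abstract does not give a parameterization, so everything here is an assumption: the function name `sample_timesteps` is hypothetical, the log-SNR is taken to be normally distributed, the mean of -6 and standard deviation of 2 are illustrative values, and the conversion to a timestep assumes a rectified-flow schedule x_t = (1 - t) * x0 + t * noise.

```python
import math
import random


def sample_timesteps(n, log_snr_mean=-6.0, log_snr_std=2.0, seed=None):
    """Draw n noise levels t in (0, 1), biased toward high noise.

    Hypothetical helper: the defaults are illustrative assumptions,
    not values stated in the abstract.
    """
    rng = random.Random(seed)
    timesteps = []
    for _ in range(n):
        # Sample the log signal-to-noise ratio from a normal distribution.
        # A negative mean puts most of the mass on low SNR (heavy noise).
        log_snr = rng.gauss(log_snr_mean, log_snr_std)
        # Assumed rectified-flow schedule: x_t = (1 - t) * x0 + t * noise,
        # so log-SNR = 2 * log((1 - t) / t) and t = sigmoid(-log_snr / 2).
        t = 1.0 / (1.0 + math.exp(log_snr / 2.0))
        timesteps.append(t)
    return timesteps


if __name__ == "__main__":
    shifted = sample_timesteps(10_000, seed=0)
    baseline = sample_timesteps(10_000, log_snr_mean=0.0, seed=0)
    print("mean t, shifted sampler: ", sum(shifted) / len(shifted))
    print("mean t, centered sampler:", sum(baseline) / len(baseline))
```

Under these assumptions, a log-SNR mean of zero spreads training roughly evenly around t = 0.5, while the shifted mean concentrates it near t = 1, the noisy end where the abstract says stylistic features emerge.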
