Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

November 23, 2024
Authors: Chaehun Shin, Jooyoung Choi, Heeseung Kim, Sungroh Yoon
cs.AI

Abstract

Subject-driven text-to-image generation aims to produce images of a new subject within a desired context by accurately capturing both the visual characteristics of the subject and the semantic content of a text prompt. Traditional methods rely on time- and resource-intensive fine-tuning for subject alignment, while recent zero-shot approaches leverage on-the-fly image prompting, often sacrificing subject alignment. In this paper, we introduce Diptych Prompting, a novel zero-shot approach that reinterprets subject-driven generation as an inpainting task with precise subject alignment by leveraging the emergent property of diptych generation in large-scale text-to-image models. Diptych Prompting arranges an incomplete diptych with the reference image in the left panel and performs text-conditioned inpainting on the right panel. We further prevent unwanted content leakage by removing the background from the reference image, and we improve fine-grained details in the generated subject by enhancing attention weights between the panels during inpainting. Experimental results confirm that our approach significantly outperforms zero-shot image prompting methods, producing images that users visually prefer. Additionally, our method supports not only subject-driven generation but also stylized image generation and subject-driven image editing, demonstrating versatility across diverse image generation applications. Project page: https://diptychprompting.github.io/
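
The diptych arrangement described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `make_diptych` and `boost_cross_panel_attention`, the white placeholder for the right panel, the `scale` value, and the row renormalization are all assumptions made for illustration. The abstract only states that the reference sits in the left panel, the right panel is inpainted, and attention weights between the panels are enhanced.

```python
import numpy as np


def make_diptych(reference: np.ndarray, fill_value: int = 255):
    """Build an incomplete diptych and its inpainting mask (hypothetical helper).

    reference: H x W x 3 image of the subject, background assumed already removed.
    Returns (canvas, mask): canvas is H x 2W x 3 with the reference on the left;
    mask is H x 2W and True where an inpainting model should generate content.
    """
    if reference.ndim != 3 or reference.shape[2] != 3:
        raise ValueError("expected an H x W x 3 image")
    h, w, _ = reference.shape
    canvas = np.full((h, 2 * w, 3), fill_value, dtype=reference.dtype)
    canvas[:, :w] = reference          # left panel: reference subject
    mask = np.zeros((h, 2 * w), dtype=bool)
    mask[:, w:] = True                 # right panel: region to inpaint
    return canvas, mask


def boost_cross_panel_attention(attn: np.ndarray, is_left: np.ndarray,
                                scale: float = 1.3) -> np.ndarray:
    """Upweight attention from right-panel queries to left-panel keys.

    attn: N x N attention matrix whose rows sum to 1.
    is_left: length-N boolean array marking tokens in the left panel.
    The scale factor and the renormalization step are assumptions.
    """
    boosted = attn.astype(float)       # astype returns a copy, input is untouched
    boosted[np.ix_(~is_left, is_left)] *= scale
    return boosted / boosted.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    ref = np.zeros((4, 3, 3), dtype=np.uint8)
    canvas, mask = make_diptych(ref)
    print(canvas.shape, mask.shape, int(mask.sum()))
```

In practice the canvas and mask would be passed, together with a text prompt describing the diptych, to a text-conditioned inpainting model; that step depends on the chosen model and is omitted here.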
