Negative Token Merging: Image-based Adversarial Feature Guidance

December 2, 2024
Authors: Jaskirat Singh, Lindsey Li, Weijia Shi, Ranjay Krishna, Yejin Choi, Pang Wei Koh, Michael F. Cohen, Stephen Gould, Liang Zheng, Luke Zettlemoyer
cs.AI

Abstract

Text-based adversarial guidance using a negative prompt has emerged as a widely adopted approach to push the output features away from undesired concepts. While useful, performing adversarial guidance using text alone can be insufficient to capture complex visual concepts and avoid undesired visual elements like copyrighted characters. In this paper, we explore, for the first time, an alternate modality in this direction by performing adversarial guidance directly using visual features from a reference image or other images in a batch. In particular, we introduce negative token merging (NegToMe), a simple but effective training-free approach that performs adversarial guidance by selectively pushing apart matching semantic features (between reference and output generation) during the reverse diffusion process. When used with respect to other images in the same batch, we observe that NegToMe significantly increases output diversity (racial, gender, visual) without sacrificing output image quality. Similarly, when used with respect to a reference copyrighted asset, NegToMe helps reduce visual similarity with copyrighted content by 34.57%. NegToMe is simple to implement in just a few lines of code, incurs only marginally higher (<4%) inference time, and generalizes to different diffusion architectures like Flux, which do not natively support the use of a separate negative prompt. Code is available at https://negtome.github.io.
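
The abstract describes the core operation only in words: match semantic features between the output and a reference, then selectively push the matched features apart. A minimal NumPy sketch of one way such a step could look is given below. It is an illustration under stated assumptions, not the authors' released code: the function name `negtome_sketch`, the cosine-similarity matching, the linear extrapolation rule, and the `alpha` and `threshold` defaults are all hypothetical choices made for this example.

```python
import numpy as np


def negtome_sketch(src, ref, alpha=0.9, threshold=0.65, eps=1e-8):
    """Push output tokens away from their best-matching reference tokens.

    src: (N, D) array of token features for the image being generated.
    ref: (M, D) array of token features from the reference image.
    alpha, threshold: hypothetical hyperparameters (assumptions, not
    values taken from the paper).
    """
    # Normalize rows so that dot products become cosine similarities.
    src_n = src / (np.linalg.norm(src, axis=-1, keepdims=True) + eps)
    ref_n = ref / (np.linalg.norm(ref, axis=-1, keepdims=True) + eps)
    sim = src_n @ ref_n.T                      # (N, M) similarity matrix

    # For each output token, find its closest reference token.
    best_idx = sim.argmax(axis=1)
    best_sim = sim.max(axis=1)
    matched = ref[best_idx]                    # (N, D)

    # Assumed update rule: linear extrapolation away from the match.
    pushed = src + alpha * (src - matched)

    # "Selectively": only tokens with a sufficiently strong match move.
    mask = (best_sim > threshold)[:, None]
    return np.where(mask, pushed, src)


if __name__ == "__main__":
    src = np.array([[1.0, 0.0], [0.0, 1.0]])
    ref = np.array([[1.0, 0.1]])
    print(negtome_sketch(src, ref))
```

In this toy example the first output token closely matches the reference token and is moved away from it, while the second token falls below the similarity threshold and is left unchanged. In an actual diffusion model, a step of this kind would presumably be applied to intermediate token features during the reverse diffusion process; where exactly it is applied is not specified in the abstract.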

