FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models
December 11, 2024
Authors: Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas, Tomer Michaeli
cs.AI
Abstract
Editing real images using a pre-trained text-to-image (T2I) diffusion/flow
model often involves inverting the image into its corresponding noise map.
However, inversion by itself is typically insufficient for obtaining
satisfactory results, and therefore many methods additionally intervene in the
sampling process. Such methods achieve improved results but are not seamlessly
transferable between model architectures. Here, we introduce FlowEdit, a
text-based editing method for pre-trained T2I flow models, which is
inversion-free, optimization-free, and model-agnostic. Our method constructs an
ODE that directly maps between the source and target distributions
(corresponding to the source and target text prompts) and achieves a lower
transport cost than the inversion approach. This leads to state-of-the-art
results, as we illustrate with Stable Diffusion 3 and FLUX. Code and examples
are available on the project's webpage.
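The key idea in the abstract — Euler-integrating an ODE that moves directly from the source image toward the target distribution, using the *difference* between target- and source-conditioned velocities evaluated on matched noisy samples — can be illustrated with a toy flow. Everything below is an illustrative assumption, not the paper's implementation: the "data" distributions are point masses at `mu_src` and `mu_tar` standing in for prompt-conditioned distributions, `toy_velocity` replaces a pre-trained T2I flow model, and the function names and signatures are invented for this sketch.

```python
import numpy as np

def toy_velocity(x, t, mu):
    # Velocity of a toy rectified flow whose data distribution is a point
    # mass at `mu` and whose noise is standard Gaussian, under the linear
    # path x_t = (1 - t) * mu + t * n.  Then dx/dt = n - mu = (x - mu) / t.
    return (x - mu) / t

def flowedit(x_src, mu_src, mu_tar, steps=50, seed=0):
    # Inversion-free editing sketch: instead of inverting x_src to noise,
    # Euler-integrate the difference of the target- and source-conditioned
    # velocities, each evaluated on a noisy sample that shares the SAME
    # noise draw, from t = 1 down to t = 0.
    rng = np.random.default_rng(seed)
    ts = np.linspace(1.0, 0.0, steps + 1)
    z_fe = x_src.copy()                        # the running edit starts at the source image
    for t, t_next in zip(ts[:-1], ts[1:]):
        n = rng.standard_normal(x_src.shape)
        z_src = (1.0 - t) * x_src + t * n      # noisy version of the source image
        z_tar = z_fe + (z_src - x_src)         # matched noisy version of the edit
        dv = toy_velocity(z_tar, t, mu_tar) - toy_velocity(z_src, t, mu_src)
        z_fe = z_fe + (t_next - t) * dv        # Euler step on the direct ODE
    return z_fe
```

In this linear toy the shared noise cancels out of the velocity difference, and the edit converges to `x_src + (mu_tar - mu_src)`: the source image shifted by the difference between the two distributions, while everything specific to `x_src` is preserved — a caricature of why a direct source-to-target ODE can incur lower transport cost than inverting to noise and re-sampling.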