FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models

December 11, 2024
Authors: Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas, Tomer Michaeli
cs.AI

Abstract

Editing real images using a pre-trained text-to-image (T2I) diffusion/flow model often involves inverting the image into its corresponding noise map. However, inversion by itself is typically insufficient for obtaining satisfactory results, and therefore many methods additionally intervene in the sampling process. Such methods achieve improved results but are not seamlessly transferable between model architectures. Here, we introduce FlowEdit, a text-based editing method for pre-trained T2I flow models, which is inversion-free, optimization-free and model agnostic. Our method constructs an ODE that directly maps between the source and target distributions (corresponding to the source and target text prompts) and achieves a lower transport cost than the inversion approach. This leads to state-of-the-art results, as we illustrate with Stable Diffusion 3 and FLUX. Code and examples are available on the project's webpage.
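The abstract's key idea, an ODE that maps directly between the source and target distributions without inverting the image to noise, can be sketched numerically. The following is an illustrative NumPy sketch, not the authors' implementation: it assumes a pretrained velocity predictor `v(z, t, prompt)` under the flow-matching convention `z_t = (1 - t)·x + t·noise`, starts the edited trajectory at the source image itself, and at each step integrates the *difference* between the target-prompt and source-prompt velocities, evaluated on a freshly noised source sample and its shifted target counterpart. All function and parameter names here are hypothetical.

```python
import numpy as np

def flowedit_sketch(x_src, v, src_prompt, tar_prompt,
                    n_steps=50, n_avg=1, rng=None):
    """Schematic inversion-free edit along a source-to-target ODE.

    x_src      : source image as a NumPy array
    v          : velocity model v(z, t, prompt) of a pretrained flow model
    n_avg      : number of noise draws to average the velocity difference over
    """
    rng = np.random.default_rng(rng)
    # t = 1 corresponds to pure noise, t = 0 to clean data.
    ts = np.linspace(1.0, 0.0, n_steps + 1)
    z = x_src.copy()  # trajectory starts at the source image, not at noise
    for i in range(n_steps):
        t, t_next = ts[i], ts[i + 1]
        dv = np.zeros_like(x_src)
        for _ in range(n_avg):
            noise = rng.standard_normal(x_src.shape)
            zt_src = (1 - t) * x_src + t * noise  # noised source sample
            zt_tar = z + (zt_src - x_src)         # shifted target sample
            dv += v(zt_tar, t, tar_prompt) - v(zt_src, t, src_prompt)
        dv /= n_avg
        z = z + (t_next - t) * dv  # Euler step on the velocity difference
    return z
```

When source and target prompts induce identical velocities, the difference vanishes and the image is returned unchanged, which is the behavior one would want from a direct source-to-target map; the averaging over noise draws is one plausible way to reduce the variance of the stochastic velocity estimate.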
