SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion

Abstract

Recent advances in text-guided image editing enable users to edit images through simple text inputs, leveraging the extensive priors of multi-step diffusion-based text-to-image models. However, these methods often fall short of the speed required for real-world and on-device applications because of the costly multi-step inversion and sampling process involved. In response, we introduce SwiftEdit, a simple yet highly efficient editing tool that achieves instant text-guided image editing (in 0.23 s). The advancement of SwiftEdit lies in its two novel contributions: a one-step inversion framework that enables one-step image reconstruction via inversion, and a mask-guided editing technique with our proposed attention rescaling mechanism to perform localized image editing. Extensive experiments demonstrate the effectiveness and efficiency of SwiftEdit. In particular, SwiftEdit is at least 50 times faster than previous multi-step methods while maintaining competitive editing results. Our project page is at: https://swift-edit.github.io/

