SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion

Abstract

Recent advances in text-guided image editing enable users to perform image edits through simple text inputs, leveraging the extensive priors of multi-step diffusion-based text-to-image models. However, these methods often fall short of the speed required for real-world and on-device applications because of the costly multi-step inversion and sampling process involved. To address this, we introduce SwiftEdit, a simple yet highly efficient editing tool that achieves instant text-guided image editing (in 0.23 s). The advancement of SwiftEdit lies in its two novel contributions: a one-step inversion framework that enables one-step image reconstruction via inversion, and a mask-guided editing technique with our proposed attention rescaling mechanism to perform localized image editing. Extensive experiments demonstrate the effectiveness and efficiency of SwiftEdit. In particular, SwiftEdit enables instant text-guided image editing that is at least 50 times faster than previous multi-step methods while maintaining competitive editing quality. Our project page is at: https://swift-edit.github.io/
