MagicQuill: An Intelligent Interactive Image Editing System

Abstract

Image editing involves a variety of complex tasks and requires efficient, precise manipulation techniques. In this paper, we present MagicQuill, an integrated image editing system that lets users quickly realize creative ideas. Our system features a streamlined yet functionally robust interface that allows editing operations (e.g., inserting elements, erasing objects, altering color) to be expressed with minimal input. These interactions are monitored by a multimodal large language model (MLLM) that anticipates editing intentions in real time, removing the need for explicit prompt entry. Finally, we apply a powerful diffusion prior, enhanced by a carefully learned two-branch plug-in module, to process editing requests with precise control. Experimental results demonstrate the effectiveness of MagicQuill in achieving high-quality image edits. Please visit https://magic-quill.github.io to try out our system.
