MagicQuill: An Intelligent Interactive Image Editing System

Abstract

Image editing involves a variety of complex tasks and requires efficient and precise manipulation techniques. In this paper, we present MagicQuill, an integrated image editing system that enables swift actualization of creative ideas. Our system features a streamlined yet functionally robust interface, allowing for the articulation of editing operations (e.g., inserting elements, erasing objects, altering color) with minimal input. These interactions are monitored by a multimodal large language model (MLLM) to anticipate editing intentions in real time, bypassing the need for explicit prompt entry. Finally, we apply a powerful diffusion prior, enhanced by a carefully learned two-branch plug-in module, to process editing requests with precise control. Experimental results demonstrate the effectiveness of MagicQuill in achieving high-quality image edits. Please visit https://magic-quill.github.io to try out our system.
