MagicQuill: An Intelligent Interactive Image Editing System

Abstract

Image editing involves a variety of complex tasks and requires efficient and precise manipulation techniques. In this paper, we present MagicQuill, an integrated image editing system that enables swift actualization of creative ideas. Our system features a streamlined yet functionally robust interface, allowing for the articulation of editing operations (e.g., inserting elements, erasing objects, altering color) with minimal input. These interactions are monitored by a multimodal large language model (MLLM) to anticipate editing intentions in real time, bypassing the need for explicit prompt entry. Finally, we apply a powerful diffusion prior, enhanced by a carefully learned two-branch plug-in module, to process editing requests with precise control. Experimental results demonstrate the effectiveness of MagicQuill in achieving high-quality image edits. Please visit https://magic-quill.github.io to try out our system.
