InstantDrag: Improving Interactivity in Drag-based Image Editing
September 13, 2024
Authors: Joonghyuk Shin, Daehyeon Choi, Jaesik Park
cs.AI
Abstract
Drag-based image editing has recently gained popularity for its interactivity
and precision. However, despite the ability of text-to-image models to generate
samples within a second, drag editing still lags behind due to the challenge of
accurately reflecting user interaction while maintaining image content. Some
existing approaches rely on computationally intensive per-image optimization or
intricate guidance-based methods, requiring additional inputs such as masks for
movable regions and text prompts, thereby compromising the interactivity of the
editing process. We introduce InstantDrag, an optimization-free pipeline that
enhances interactivity and speed, requiring only an image and a drag
instruction as input. InstantDrag consists of two carefully designed networks:
a drag-conditioned optical flow generator (FlowGen) and an optical
flow-conditioned diffusion model (FlowDiffusion). InstantDrag learns motion
dynamics for drag-based image editing in real-world video datasets by
decomposing the task into motion generation and motion-conditioned image
generation. We demonstrate InstantDrag's capability to perform fast,
photo-realistic edits without masks or text prompts through experiments on
facial video datasets and general scenes. These results highlight the
efficiency of our approach in handling drag-based image editing, making it a
promising solution for interactive, real-time applications.
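
The two-stage decomposition described in the abstract (sparse drag to dense motion, then motion to edited image) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names are hypothetical, the Gaussian falloff merely stands in for the learned FlowGen network, and the nearest-neighbor warp merely stands in for the FlowDiffusion model. It shows only how the data flows between the two stages, not how either network works.

```python
import math

def toy_flow_from_drag(height, width, src, dst, sigma=8.0):
    """Stage 1 stand-in (hypothetical): spread one drag (src -> dst, both (x, y))
    into a dense flow field using a Gaussian falloff around the source point."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    flow = []
    for y in range(height):
        row = []
        for x in range(width):
            dist2 = (x - src[0]) ** 2 + (y - src[1]) ** 2
            weight = math.exp(-dist2 / (2 * sigma ** 2))
            row.append((weight * dx, weight * dy))
        flow.append(row)
    return flow

def toy_edit_from_flow(image, flow):
    """Stage 2 stand-in (hypothetical): approximate backward warp that samples
    each output pixel from the position the flow says it came from."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            fx, fy = flow[y][x]
            sx = min(max(round(x - fx), 0), width - 1)
            sy = min(max(round(y - fy), 0), height - 1)
            row.append(image[sy][sx])
        out.append(row)
    return out

def drag_edit_sketch(image, src, dst):
    """Chain the two stages: the only inputs are an image and a drag."""
    flow = toy_flow_from_drag(len(image), len(image[0]), src, dst)
    return toy_edit_from_flow(image, flow), flow
```

As in the abstract, the sketch takes only an image and a drag instruction, with no mask or text prompt. In this toy version the image is a nested list of pixel values and the drag is a pair of (x, y) points.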