MIVE: New Design and Benchmark for Multi-Instance Video Editing

December 17, 2024
Authors: Samuel Teodoro, Agus Gunawan, Soo Ye Kim, Jihyong Oh, Munchurl Kim
cs.AI

Abstract

Recent AI-based video editing has enabled users to edit videos through simple text prompts, significantly simplifying the editing process. However, recent zero-shot video editing techniques primarily focus on global or single-object edits, which can lead to unintended changes in other parts of the video. When multiple objects require localized edits, existing methods face challenges, such as unfaithful editing, editing leakage, and lack of suitable evaluation datasets and metrics. To overcome these limitations, we propose a zero-shot Multi-Instance Video Editing framework, called MIVE. MIVE is a general-purpose mask-based framework, not dedicated to specific objects (e.g., people). MIVE introduces two key modules: (i) Disentangled Multi-instance Sampling (DMS) to prevent editing leakage and (ii) Instance-centric Probability Redistribution (IPR) to ensure precise localization and faithful editing. Additionally, we present our new MIVE Dataset featuring diverse video scenarios and introduce the Cross-Instance Accuracy (CIA) Score to evaluate editing leakage in multi-instance video editing tasks. Our extensive qualitative, quantitative, and user study evaluations demonstrate that MIVE significantly outperforms recent state-of-the-art methods in terms of editing faithfulness, accuracy, and leakage prevention, setting a new benchmark for multi-instance video editing. The project page is available at https://kaist-viclab.github.io/mive-site/
