MIVE: New Design and Benchmark for Multi-Instance Video Editing
December 17, 2024
Authors: Samuel Teodoro, Agus Gunawan, Soo Ye Kim, Jihyong Oh, Munchurl Kim
cs.AI
Abstract
Recent AI-based video editing has enabled users to edit videos through simple
text prompts, significantly simplifying the editing process. However, recent
zero-shot video editing techniques primarily focus on global or single-object
edits, which can lead to unintended changes in other parts of the video. When
multiple objects require localized edits, existing methods face challenges,
such as unfaithful editing, editing leakage, and lack of suitable evaluation
datasets and metrics. To overcome these limitations, we propose a zero-shot
Multi-Instance Video Editing
framework, called MIVE. MIVE is a general-purpose mask-based framework, not
dedicated to specific objects (e.g., people). MIVE introduces two key modules:
(i) Disentangled Multi-instance Sampling (DMS) to prevent editing leakage and
(ii) Instance-centric Probability Redistribution (IPR) to ensure precise
localization and faithful editing. Additionally, we present our new MIVE
Dataset featuring diverse video scenarios and introduce the Cross-Instance
Accuracy (CIA) Score to evaluate editing leakage in multi-instance video
editing tasks. Our extensive qualitative, quantitative, and user study
evaluations demonstrate that MIVE significantly outperforms recent
state-of-the-art methods in terms of editing faithfulness, accuracy, and
leakage prevention, setting a new benchmark for multi-instance video editing.
The project page is available at https://kaist-viclab.github.io/mive-site/.