ObjCtrl-2.5D: Training-free Object Control with Camera Poses
December 10, 2024
Authors: Zhouxia Wang, Yushi Lan, Shangchen Zhou, Chen Change Loy
cs.AI
Abstract
This study aims to achieve more precise and versatile object control in
image-to-video (I2V) generation. Current methods typically represent the
spatial movement of target objects with 2D trajectories, which often fail to
capture user intention and frequently produce unnatural results. To enhance
control, we present ObjCtrl-2.5D, a training-free object control approach that
uses a 3D trajectory, extended from a 2D trajectory with depth information, as
a control signal. By modeling object movement as camera movement, ObjCtrl-2.5D
represents the 3D trajectory as a sequence of camera poses, enabling object
motion control using an existing camera motion control I2V generation model
(CMC-I2V) without training. To adapt the CMC-I2V model originally designed for
global motion control to handle local object motion, we introduce a module to
isolate the target object from the background, enabling independent local
control. In addition, we devise an effective way to achieve more accurate
object control by sharing the low-frequency warped latent within the object's
region across frames. Extensive experiments demonstrate that ObjCtrl-2.5D
significantly improves object control accuracy compared to training-free
methods and offers more diverse control capabilities than training-based
approaches using 2D trajectories, enabling complex effects like object
rotation. Code and results are available at
https://wzhouxiff.github.io/projects/ObjCtrl-2.5D/.
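
The idea of extending a 2D trajectory with depth and then expressing object motion as camera poses can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the authors' code: a pinhole camera with hypothetical intrinsics (fx, fy, cx, cy), per-point depths supplied by the caller, translation-only poses, and a world-to-camera convention X_cam = R X_world + t. The function names lift_trajectory and trajectory_to_camera_poses are hypothetical.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Pose = List[List[float]]  # 3x4 extrinsic matrix [R | t], world-to-camera (assumed)


def lift_trajectory(points_2d: Sequence[Tuple[float, float]],
                    depths: Sequence[float],
                    fx: float, fy: float, cx: float, cy: float) -> List[Point3]:
    """Unproject pixel positions (u, v) with depth z to camera-space 3D points
    using a pinhole model (an assumption of this sketch)."""
    return [((u - cx) * z / fx, (v - cy) * z / fy, z)
            for (u, v), z in zip(points_2d, depths)]


def trajectory_to_camera_poses(points_3d: Sequence[Point3]) -> List[Pose]:
    """Express the object's displacement from the first frame as a
    translation-only extrinsic [I | t] per frame.

    With X_cam = X_world + t and a static object, setting t to the desired
    displacement makes the object appear to move by that amount, which is
    equivalent to moving the camera centre by -t.
    """
    x0, y0, z0 = points_3d[0]
    poses = []
    for x, y, z in points_3d:
        tx, ty, tz = x - x0, y - y0, z - z0
        poses.append([[1.0, 0.0, 0.0, tx],
                      [0.0, 1.0, 0.0, ty],
                      [0.0, 0.0, 1.0, tz]])
    return poses


# Hypothetical example: the object moves right in the image and recedes in depth.
track_2d = [(256.0, 256.0), (306.0, 256.0), (356.0, 256.0)]
track_depth = [2.0, 2.0, 4.0]
track_3d = lift_trajectory(track_2d, track_depth, fx=500.0, fy=500.0, cx=256.0, cy=256.0)
camera_poses = trajectory_to_camera_poses(track_3d)
```

In this sketch the depth values are what distinguish the 3D trajectory from its 2D counterpart: the last two points differ by the same pixel offset as the first two, but the change in depth produces a translation along the z-axis as well. Rotations, which the abstract mentions as a supported effect, are omitted here.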