
Trajectory Attention for Fine-grained Video Motion Control

November 28, 2024
作者: Zeqi Xiao, Wenqi Ouyang, Yifan Zhou, Shuai Yang, Lei Yang, Jianlou Si, Xingang Pan
cs.AI

Abstract

Recent advancements in video generation have been greatly driven by video diffusion models, with camera motion control emerging as a crucial challenge in creating view-customized visual content. This paper introduces trajectory attention, a novel approach that performs attention along available pixel trajectories for fine-grained camera motion control. Unlike existing methods that often yield imprecise outputs or neglect temporal correlations, our approach possesses a stronger inductive bias that seamlessly injects trajectory information into the video generation process. Importantly, our approach models trajectory attention as an auxiliary branch alongside traditional temporal attention. This design enables the original temporal attention and the trajectory attention to work in synergy, ensuring both precise motion control and new content generation capability, which is critical when the trajectory is only partially available. Experiments on camera motion control for images and videos demonstrate significant improvements in precision and long-range consistency while maintaining high-quality generation. Furthermore, we show that our approach can be extended to other video motion control tasks, such as first-frame-guided video editing, where it excels in maintaining content consistency over large spatial and temporal ranges.
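The core mechanism the abstract describes, attention computed along pixel trajectories, fused as an auxiliary branch with ordinary temporal attention, can be illustrated with a minimal NumPy sketch. The shapes, the gather/scatter scheme, and the additive fusion below are illustrative assumptions for exposition, not the paper's exact architecture:

```python
# Hedged sketch: attention along pixel trajectories, added as an auxiliary
# branch to a temporal-attention output. All function names and shapes here
# are hypothetical stand-ins, not the authors' implementation.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # q, k, v: [tokens, dim] -- plain single-head dot-product attention
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def trajectory_attention(feats, trajs):
    """feats: [F, N, C] per-frame token features (F frames, N tokens each).
    trajs: [T, F] int array; trajs[t, f] is the token index that
    trajectory t passes through in frame f (trajectories may be partial
    in the paper; here we assume full-length ones for simplicity)."""
    F, N, C = feats.shape
    out = np.zeros_like(feats)
    count = np.zeros((F, N, 1))
    frame_idx = np.arange(F)
    for t in range(trajs.shape[0]):
        # Gather the features this trajectory passes through: [F, C]
        path = feats[frame_idx, trajs[t]]
        # Attend among the F samples along the trajectory
        attended = attention(path, path, path)
        # Scatter the result back to the visited token positions
        out[frame_idx, trajs[t]] += attended
        count[frame_idx, trajs[t]] += 1
    # Average where several trajectories visit the same token;
    # tokens on no trajectory keep a zero auxiliary signal.
    return out / np.maximum(count, 1)

def fused_temporal_block(feats, trajs, temporal_attn_out):
    # Auxiliary-branch fusion: the trajectory branch is simply added to the
    # original temporal-attention output, so motion guidance and new-content
    # generation can coexist (additive fusion is an assumption here).
    return temporal_attn_out + trajectory_attention(feats, trajs)
```

Tokens not covered by any trajectory receive no auxiliary signal, which mirrors the abstract's point that the original temporal attention must still handle new content where trajectories are only partially available.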

